Qwen2.5-1.5B DSP (English–Albanian)
A fine-tuned version of Qwen2.5-1.5B, adapted to act as a bilingual (English–Albanian) Digital Signal Processing (DSP) teaching assistant.
This model was trained as part of a university project exploring whether small LLMs can
handle technical DSP reasoning using verified synthetic datasets and QLoRA.
Intended Use
- Coursework-level DSP questions (FFT, aliasing, sampling)
- Providing concise explanations in English or Albanian
- Supporting mixed-language prompts
- Acting as a lightweight educational assistant
Not designed for safety-critical or professional engineering scenarios.
Training Overview
Training setup:
- QLoRA (4-bit quantisation + LoRA adapters)
- Fine-tuning on a bilingual (EN/SQ) DSP dataset
- Synthetic and manually written verified examples
- Teacher–student refinement using larger models
- Training VRAM requirement: ~5–6 GB
The training process mirrors the one used for the Llama 3.2 1B DSP variant, allowing a direct comparison between the two architectures.
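As a rough illustration of this setup, the sketch below shows a typical QLoRA configuration with the transformers, peft, and bitsandbytes libraries; the rank, alpha, and target modules are illustrative assumptions, not the exact values used for this model.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with 4-bit (NF4) quantised weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; the quantised base stays frozen.
lora_config = LoraConfig(
    r=16,                 # adapter rank (illustrative)
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

Training only the adapters on top of 4-bit base weights is what keeps the footprint in the ~5–6 GB range reported above.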
Dataset
The model was trained on a combined dataset containing:
- Manually authored DSP questions in English and Albanian
- An English–Albanian DSP glossary
- Synthetic tasks generated programmatically:
  - FFT bin index calculations
  - Aliasing scenarios
  - Sampling frequency relations
- Verification scripts to recompute all numeric results
Only entries passing all checks were included in the final dataset.
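As a concrete illustration of this generate-then-verify loop, the hypothetical sketch below builds aliasing questions from known parameters and recomputes each answer from those parameters before an entry is accepted; all names and ranges are assumptions, not the project's actual scripts.

```python
import random

def alias_frequency(f, fs):
    """Frequency at which a real tone at f Hz appears after sampling at fs Hz."""
    f_folded = f % fs
    return fs - f_folded if f_folded > fs / 2 else f_folded

def make_aliasing_example():
    fs = random.choice([8_000, 20_000, 44_100])
    f = random.randint(fs // 2 + 1, fs - 1)  # deliberately undersampled tone
    return {
        "question": f"A tone at {f} Hz is sampled at {fs} Hz. "
                    "What is the aliased frequency?",
        "params": {"f": f, "fs": fs},
        "answer_hz": alias_frequency(f, fs),
    }

def verify(example):
    """Recompute the expected answer from the stored parameters; reject mismatches."""
    p = example["params"]
    return example["answer_hz"] == alias_frequency(p["f"], p["fs"])

# Only entries that pass every check reach the final dataset.
dataset = [ex for ex in (make_aliasing_example() for _ in range(1_000)) if verify(ex)]
```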
Evaluation (Summary)
This model performs similarly to the fine-tuned Llama 3.2 1B DSP model, with small differences in explanation style and multilingual behaviour.
| Prompt language | Accuracy (approx.) |
|---|---|
| English | ~78% |
| Albanian | ~68% |
| Mixed EN/SQ | ~65% |
Strengths:
- Strong numeric reliability
- Good stability in English
- Reasonable Albanian output given the dataset size
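Accuracy figures like these are typically computed by checking the model's final numeric answer against the verified reference. The checker below is a purely illustrative sketch; the extraction regex and the 1% tolerance are assumptions, not a description of the project's actual evaluation harness.

```python
import re

def extract_hz(text):
    """Pull the last 'X Hz' or 'X kHz' value from a model response."""
    matches = re.findall(r"(\d+(?:\.\d+)?)\s*(k?)Hz", text)
    if not matches:
        return None
    value, kilo = matches[-1]
    return float(value) * (1000.0 if kilo else 1.0)

def is_correct(response, reference_hz, rel_tol=0.01):
    """Score a response as correct if its final frequency is within tolerance."""
    predicted = extract_hz(response)
    return predicted is not None and abs(predicted - reference_hz) <= rel_tol * reference_hz

# e.g. is_correct("The aliased frequency is 7 kHz.", 7000.0) -> True
```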
How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Irfanuruchi/qwen2.5-1.5b-dsp-finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place the model on GPU automatically if available
)

prompt = "A tone at 13 kHz is sampled at 20 kHz. What is the aliased frequency?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
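For reference, the expected answer in this example: a real tone at frequency f with fs/2 < f < fs folds back to fs - f after sampling, so the 13 kHz tone sampled at 20 kHz should alias to 20 - 13 = 7 kHz.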
License
These weights are based on Qwen/Qwen2.5-1.5B and are therefore governed by the Qwen 2.5 license.
The fine-tuning code and accompanying project materials are released separately under the MIT License on GitHub:
https://github.com/IrfanUruchi/dsp-llm-bilingual-finetuning
Users must comply with:
- the Qwen 2.5 license for the model weights
- the MIT License for the project code
Acknowledgements
This model was developed as part of the Introduction to Data Science course at South East European University.
Thanks to Professor Nuhi Besimi for guidance and feedback throughout the project.