Qwen2.5-1.5B DSP (English–Albanian)
A fine-tuned version of Qwen2.5-1.5B, adapted to act as a bilingual (English–Albanian) Digital Signal Processing (DSP) teaching assistant.
This model was trained as part of a university project exploring whether small LLMs can
handle technical DSP reasoning using verified synthetic datasets and QLoRA.
Intended Use
- Coursework-level DSP questions (FFT, aliasing, sampling)
- Providing concise explanations in English or Albanian
- Supporting mixed-language prompts
- Acting as a lightweight educational assistant
Not designed for safety-critical or professional engineering scenarios.
Training Overview
Training setup:
- QLoRA (4-bit quantisation + LoRA adapters)
- Fine-tuning on a bilingual (EN/SQ) DSP dataset
- Synthetic and manually written verified examples
- Teacher–student refinement using larger models
- Training VRAM requirement: ~5–6 GB
The training process mirrors the one used for the Llama 3.2 1B DSP variant, allowing a direct comparison between the two architectures.
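As a rough illustration of this setup, the sketch below shows a typical QLoRA configuration with the transformers, peft, and bitsandbytes libraries; the rank, alpha, and target modules are illustrative assumptions, not the exact values used for this model.

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Load the base model with 4-bit (NF4) quantised weights.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach small trainable LoRA adapters; the quantised base stays frozen.
lora_config = LoraConfig(
    r=16,                 # adapter rank (illustrative)
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
```

Training only the adapters on top of 4-bit base weights is what keeps the footprint in the ~5–6 GB range reported above.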
Dataset
The model was trained on a combined dataset containing:
- Manually authored DSP questions in English and Albanian
- An English–Albanian DSP glossary
- Synthetic tasks generated programmatically:
  - FFT bin index calculations
  - Aliasing scenarios
  - Sampling frequency relations
- Verification scripts to recompute all numeric results
Only entries passing all checks were included in the final dataset.
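As a concrete illustration of this generate-then-verify loop, the hypothetical sketch below builds aliasing questions from known parameters and recomputes each answer from those parameters before an entry is accepted; all names and ranges are assumptions, not the project's actual scripts.

```python
import random

def alias_frequency(f, fs):
    """Frequency at which a real tone at f Hz appears after sampling at fs Hz."""
    f_folded = f % fs
    return fs - f_folded if f_folded > fs / 2 else f_folded

def make_aliasing_example():
    fs = random.choice([8_000, 20_000, 44_100])
    f = random.randint(fs // 2 + 1, fs - 1)  # deliberately undersampled tone
    return {
        "question": f"A tone at {f} Hz is sampled at {fs} Hz. "
                    "What is the aliased frequency?",
        "params": {"f": f, "fs": fs},
        "answer_hz": alias_frequency(f, fs),
    }

def verify(example):
    """Recompute the expected answer from the stored parameters; reject mismatches."""
    p = example["params"]
    return example["answer_hz"] == alias_frequency(p["f"], p["fs"])

# Only entries that pass every check reach the final dataset.
dataset = [ex for ex in (make_aliasing_example() for _ in range(1_000)) if verify(ex)]
```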
Evaluation (Summary)
This model performs similarly to the fine-tuned Llama 3.2 1B DSP model, with small differences in explanation style and multilingual behaviour.
| Prompt language | Accuracy (approx.) |
|---|---|
| English | ~78% |
| Albanian | ~68% |
| Mixed EN/SQ | ~65% |
Strengths:
- Strong numeric reliability
- Good stability in English
- Reasonable Albanian output given the dataset size
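Accuracy figures like these are typically computed by checking the model's final numeric answer against the verified reference. The checker below is a purely illustrative sketch; the extraction regex and the 1% tolerance are assumptions, not a description of the project's actual evaluation harness.

```python
import re

def extract_hz(text):
    """Pull the last 'X Hz' or 'X kHz' value from a model response."""
    matches = re.findall(r"(\d+(?:\.\d+)?)\s*(k?)Hz", text)
    if not matches:
        return None
    value, kilo = matches[-1]
    return float(value) * (1000.0 if kilo else 1.0)

def is_correct(response, reference_hz, rel_tol=0.01):
    """Score a response as correct if its final frequency is within tolerance."""
    predicted = extract_hz(response)
    return predicted is not None and abs(predicted - reference_hz) <= rel_tol * reference_hz

# e.g. is_correct("The aliased frequency is 7 kHz.", 7000.0) -> True
```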
How to Use
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Irfanuruchi/qwen2.5-1.5b-dsp-finetuned"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",  # place the model on GPU automatically if available
)

prompt = "A tone at 13 kHz is sampled at 20 kHz. What is the aliased frequency?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
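For reference, the expected answer in this example: a real tone at frequency f with fs/2 < f < fs folds back to fs - f after sampling, so the 13 kHz tone sampled at 20 kHz should alias to 20 - 13 = 7 kHz.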
License
These weights are based on Qwen/Qwen2.5-1.5B and are therefore governed by the Qwen 2.5 license.
The fine-tuning code and accompanying project materials are released separately under the MIT License on GitHub:
https://github.com/IrfanUruchi/dsp-llm-bilingual-finetuning
Users must comply with:
- the Qwen 2.5 license for the model weights
- the MIT License for the project code
Acknowledgements
This model was developed as part of the Introduction to Data Science course at South East European University.
Thanks to Professor Nuhi Besimi for guidance and feedback throughout the project.