Qwen 2.5–1.5B DSP (English–Albanian)

A fine-tuned version of Qwen 2.5–1.5B, adapted to act as a bilingual (English–Albanian)
Digital Signal Processing (DSP) teaching assistant.

This model was trained as part of a university project exploring whether small LLMs can
handle technical DSP reasoning using verified synthetic datasets and QLoRA.


Intended Use

  • Coursework-level DSP questions (FFT, aliasing, sampling)
  • Providing concise explanations in English or Albanian
  • Supporting mixed-language prompts
  • Acting as a lightweight educational assistant

Not designed for safety-critical or professional engineering scenarios.


Training Overview

Training setup:

  • QLoRA (4-bit quantisation + LoRA adapters); see the configuration sketch below
  • Fine-tuning on a bilingual (EN/SQ) DSP dataset
  • A mix of synthetic and manually written examples, all verified
  • Teacher–student refinement using larger models
  • Training VRAM requirement: ~5–6 GB

The training process mirrors the one used for the LLaMA 3.2–1B DSP variant, allowing
a direct architectural comparison.
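
For orientation, the following is a minimal QLoRA configuration sketch using transformers, bitsandbytes, and peft. The hyperparameters shown (rank, alpha, dropout, target modules) are illustrative assumptions, not the exact values used to train this model.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit NF4 quantisation of the base weights, as in standard QLoRA.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-1.5B",
    quantization_config=bnb_config,
    device_map="auto",
)

# LoRA adapters on the attention projections (hyperparameters are assumptions).
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable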


Dataset

The model was trained on a combined dataset containing:

  • Manually authored DSP questions in English and Albanian
  • An English–Albanian DSP glossary
  • Synthetic tasks generated programmatically:
    • FFT bin index calculations
    • Aliasing scenarios
    • Sampling frequency relations
  • Verification scripts to recompute all numeric results (a sketch follows below)

Only entries passing all checks were included in the final dataset.
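
As an illustration of that generate-and-verify loop (with hypothetical helper names, not the project's actual scripts), an aliasing item might be produced and checked like this:

import random

def aliased_frequency(f_tone: float, f_s: float) -> float:
    """Fold a tone at f_tone into the baseband [0, f_s / 2]."""
    f = f_tone % f_s
    return min(f, f_s - f)

def fft_bin_index(f: float, f_s: float, n: int) -> int:
    """Nearest bin of an n-point FFT for a tone at f, sampled at f_s."""
    return round(f * n / f_s)

def make_aliasing_item() -> dict:
    """Generate one question, keeping the raw inputs for re-checking."""
    f_s = random.choice([8_000, 16_000, 20_000, 44_100])
    f_tone = random.randint(f_s // 2 + 1, 2 * f_s)  # deliberately above Nyquist
    return {
        "f_tone": f_tone,
        "f_s": f_s,
        "question": f"A tone at {f_tone} Hz is sampled at {f_s} Hz. "
                    "What is the aliased frequency?",
        "answer_hz": aliased_frequency(f_tone, f_s),
    }

def verify_item(item: dict) -> bool:
    """Recompute the answer along an independent code path."""
    f = item["f_tone"] % item["f_s"]
    if f > item["f_s"] / 2:
        f = item["f_s"] - f
    return item["answer_hz"] == f

# Keep only the items whose answers survive independent recomputation.
dataset = [it for it in (make_aliasing_item() for _ in range(1_000))
           if verify_item(it)]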


Evaluation (Summary)

This model performs similarly to the fine-tuned LLaMA 3.2–1B DSP model, with small
differences in explanation style and multilingual behaviour.

Pipeline        Accuracy (approx.)
English         ~78%
Albanian        ~68%
Mixed EN/SQ     ~65%

Strengths:

  • Strong numeric reliability
  • Good stability in English
  • Reasonable Albanian output given the dataset size

How to Use

from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "Irfanuruchi/qwen2.5-1.5b-dsp-finetuned"

# Load the tokenizer and model; device_map="auto" places the weights
# on a GPU when one is available, otherwise on the CPU.
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
)

prompt = "A tone at 13 kHz is sampled at 20 kHz. What is the aliased frequency?"

# Tokenise the prompt, move it to the model's device, and generate a short answer.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)

print(tokenizer.decode(outputs[0], skip_special_tokens=True))
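
For this prompt, the expected answer is 7 kHz: the 13 kHz tone lies above the 10 kHz Nyquist frequency, so it folds down to 20 - 13 = 7 kHz.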

License

These weights are based on Qwen/Qwen2.5-1.5B and are therefore governed by the Qwen 2.5 license.

The fine-tuning code and accompanying project materials are released separately under the MIT License on GitHub:

https://github.com/IrfanUruchi/dsp-llm-bilingual-finetuning

Users must comply with:

  • The Qwen 2.5 license for the model weights
  • The MIT License for the project code


Acknowledgements

This model was developed as part of the Introduction to Data Science course at South East European University.

Thanks to Professor Nuhi Besimi for guidance and feedback throughout the project.
