Model Card: llm-course-hw3-tinyllama-qlora

This model was fine-tuned as part of Homework 3 in the HSE LLM Course.
It applies QLoRA to a chat-based TinyLlama model for sentiment classification.

The model predicts a sentiment label (negative, neutral, or positive) by generating a short text response conditioned on the input, formatted using the model’s chat template.

Training Method

Training follows a supervised fine-tuning setup in a conversational format.

Each training example is converted into a dialogue of the form:

[
  {"role": "system", "content": "<instruction>"},
  {"role": "user", "content": "Text: <input text>"},
  {"role": "assistant", "content": "<sentiment label>"}
]
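A minimal sketch of this conversion as a plain Python function. The exact wording of the system instruction is an assumption for illustration, not the prompt used in the homework:

```python
def build_example(text: str, label: str) -> list[dict]:
    """Convert a (text, label) pair into a chat-format training example."""
    # Hypothetical instruction wording; the actual course prompt may differ.
    instruction = (
        "Classify the sentiment of the text as negative, neutral, or positive."
    )
    return [
        {"role": "system", "content": instruction},
        {"role": "user", "content": f"Text: {text}"},
        {"role": "assistant", "content": label},
    ]

example = build_example("The plot was dull.", "negative")
```

The resulting list of messages is then rendered into a single training string with the base model's chat template (e.g. `tokenizer.apply_chat_template`).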

Training Hyperparameters

  • Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0

  • PEFT method: QLoRA (LoRA + 4-bit quantization)

  • Quantization: NF4, 4-bit

  • Compute dtype: FP16

  • LoRA rank: 8

  • LoRA alpha: 16

  • LoRA dropout: 0.05

  • Target modules: attention projections (q_proj, k_proj, v_proj, o_proj)

  • Batch size (per device): 16

  • Gradient accumulation steps: 2

  • Learning rate: 2e-4

  • Optimizer: paged_adamw_8bit

  • LR scheduler: cosine

  • Epochs: 1
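The quantization and LoRA settings above map onto standard `transformers`/`peft` configuration objects. A sketch, assuming the usual QLoRA setup (the variable names are illustrative):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with FP16 compute, per the hyperparameters above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapters on the attention projections only
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```

With a per-device batch size of 16 and 2 gradient accumulation steps, the effective batch size is 32.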

Results

  • Macro F1 (test set): ~0.54
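Macro F1 averages the per-class F1 scores with equal weight, so the minority class counts as much as the majority ones. A small self-contained sketch of the metric over the three labels (equivalent to `sklearn.metrics.f1_score(..., average="macro")`):

```python
def macro_f1(y_true, y_pred, labels=("negative", "neutral", "positive")):
    """Unweighted mean of per-class F1 scores."""
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(
            2 * precision * recall / (precision + recall)
            if precision + recall else 0.0
        )
    return sum(f1s) / len(f1s)
```

In practice the generated assistant response is first mapped back to one of the three label strings before scoring.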
