# Model Card: llm-course-hw3-tinyllama-qlora
This model was fine-tuned as part of Homework 3 in the HSE LLM Course.
It applies QLoRA to a chat-based TinyLlama model for sentiment classification.
The model predicts a sentiment label (negative, neutral, or positive) by generating a short text response to an input prompt formatted with the model's chat template.
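A minimal usage sketch is shown below. It assumes the `transformers` and `peft` libraries; the system prompt wording is illustrative, not necessarily the exact instruction used in training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "sodeniZz/llm-course-hw3-tinyllama-qlora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the QLoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative system instruction (assumption, not the verbatim training prompt)
messages = [
    {"role": "system", "content": "Classify the sentiment of the text as negative, neutral, or positive."},
    {"role": "user", "content": "Text: I love this new phone!"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=5, do_sample=False)

# Decode only the newly generated tokens (the predicted label)
print(tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True))
```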
## Model Sources
- Base model: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Dataset: https://huggingface.co/datasets/cardiffnlp/tweet_eval
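The three labels correspond to the `sentiment` configuration of tweet_eval; a sketch of loading it with the `datasets` library:

```python
from datasets import load_dataset

# tweet_eval "sentiment" config: labels 0=negative, 1=neutral, 2=positive
dataset = load_dataset("cardiffnlp/tweet_eval", "sentiment")
print(dataset["train"][0])  # {'text': ..., 'label': ...}
```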
## Training Method
Training follows a supervised fine-tuning setup in a conversational format.
Each training example is converted into a dialogue of the form:
{"role": "system", "content": "<instruction>"},
{"role": "user", "content": "Text: <input text>"},
{"role": "assistant", "content": "<sentiment label>"}
## Training Hyperparameters
- Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- PEFT method: QLoRA (LoRA + 4-bit quantization)
- Quantization: NF4, 4-bit
- Compute dtype: FP16
- LoRA rank: 8
- LoRA alpha: 16
- LoRA dropout: 0.05
- Target modules: attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`)
- Batch size (per device): 16
- Gradient accumulation steps: 2
- Learning rate: 2e-4
- Optimizer: `paged_adamw_8bit`
- LR scheduler: cosine
- Epochs: 1
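These settings map onto the standard `transformers`/`peft` configuration objects. A sketch under that assumption (argument names follow those libraries; this is not a verbatim copy of the training script):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with FP16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter on the attention projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```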
## Results
- Macro F1 (test set): ~0.54
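Macro F1 averages the per-class F1 scores, weighting the three labels equally regardless of their frequency. A toy sketch of the computation (assuming `scikit-learn`, with generated label strings already mapped to ids):

```python
from sklearn.metrics import f1_score

# y_true / y_pred: label ids (0=negative, 1=neutral, 2=positive) parsed from generations
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]
print(f1_score(y_true, y_pred, average="macro"))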