# Model Card: llm-course-hw3-tinyllama-qlora
This model was fine-tuned as part of Homework 3 in the HSE LLM Course.
It applies QLoRA to a chat-based TinyLlama model for sentiment classification.
The model predicts a sentiment label (negative, neutral, or positive) by generating a short text response to an input prompt formatted with the model's chat template.
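A minimal usage sketch is shown below. It assumes the `transformers` and `peft` libraries; the system prompt wording is illustrative, not necessarily the exact instruction used in training.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"
adapter_id = "sodeniZz/llm-course-hw3-tinyllama-qlora"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.float16, device_map="auto"
)
# Attach the QLoRA adapter weights on top of the base model
model = PeftModel.from_pretrained(model, adapter_id)

# Illustrative system instruction (assumption, not the verbatim training prompt)
messages = [
    {"role": "system", "content": "Classify the sentiment of the text as negative, neutral, or positive."},
    {"role": "user", "content": "Text: I love this new phone!"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(inputs, max_new_tokens=5, do_sample=False)

# Decode only the newly generated tokens (the predicted label)
print(tokenizer.decode(output[0, inputs.shape[-1]:], skip_special_tokens=True))
```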
## Model Sources
- Base model: https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0
- Dataset: https://huggingface.co/datasets/cardiffnlp/tweet_eval
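The three labels correspond to the `sentiment` configuration of tweet_eval; a sketch of loading it with the `datasets` library:

```python
from datasets import load_dataset

# tweet_eval "sentiment" config: labels 0=negative, 1=neutral, 2=positive
dataset = load_dataset("cardiffnlp/tweet_eval", "sentiment")
print(dataset["train"][0])  # {'text': ..., 'label': ...}
```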
## Training Method
Training follows a supervised fine-tuning setup in a conversational format.
Each training example is converted into a dialogue of the form:
{"role": "system", "content": "<instruction>"},
{"role": "user", "content": "Text: <input text>"},
{"role": "assistant", "content": "<sentiment label>"}
## Training Hyperparameters
- Base model: TinyLlama/TinyLlama-1.1B-Chat-v1.0
- PEFT method: QLoRA (LoRA + 4-bit quantization)
- Quantization: NF4, 4-bit
- Compute dtype: FP16
- LoRA rank: 8
- LoRA alpha: 16
- LoRA dropout: 0.05
- Target modules: attention projections (`q_proj`, `k_proj`, `v_proj`, `o_proj`)
- Batch size (per device): 16
- Gradient accumulation steps: 2
- Learning rate: 2e-4
- Optimizer: `paged_adamw_8bit`
- LR scheduler: cosine
- Epochs: 1
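These settings map onto the standard `transformers`/`peft` configuration objects. A sketch under that assumption (argument names follow those libraries; this is not a verbatim copy of the training script):

```python
import torch
from transformers import BitsAndBytesConfig
from peft import LoraConfig

# 4-bit NF4 quantization with FP16 compute, as listed above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

# LoRA adapter on the attention projections
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
```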
## Results
- Macro F1 (test set): ~0.54
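Macro F1 averages the per-class F1 scores, weighting the three labels equally regardless of their frequency. A toy sketch of the computation (assuming `scikit-learn`, with generated label strings already mapped to ids):

```python
from sklearn.metrics import f1_score

# y_true / y_pred: label ids (0=negative, 1=neutral, 2=positive) parsed from generations
y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 2, 1, 1]
print(f1_score(y_true, y_pred, average="macro"))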