Llama-3.2-3B-Instruction-LoRA-Adapter
This repo contains a LoRA adapter for Meta Llama 3.2 3B, fine-tuned with QLoRA (4-bit) and TRL's SFTTrainer on an instruction dataset derived from the Databricks Dolly corpus (databricks-dolly-15k).
Model Details
- Base Model: meta-llama/Llama-3.2-3B
- Model Type: LoRA adapter for causal language modeling
- Finetuning Method: LoRA + QLoRA (4-bit quantization)
- Trainable Parameters: ~0.33% via LoRA (rank=8)
- Quantization: 4-bit NF4
- Language: English
Training Data
- Dataset: databricks-dolly-1k, a ~1k-example subset of databricks-dolly-15k
- Samples: 982 training, 110 validation
- Split: 90% train / 10% validation
- Text Field: "text" (formatted instruction + response pairs); see the sketch below
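The split can be reproduced roughly as follows. This is a minimal sketch: the exact subset selection, random seed, and the prompt template used to build the "text" field are assumptions rather than details taken from the training notebook.

```python
from datasets import load_dataset

# Load the full Dolly corpus and take a ~1k-example subset (selection strategy assumed).
dolly = load_dataset("databricks/databricks-dolly-15k", split="train")
subset = dolly.shuffle(seed=42).select(range(1092))

# Build the "text" field from instruction/context/response (prompt template assumed).
def to_text(example):
    context = f"\n{example['context']}" if example["context"] else ""
    example["text"] = (
        f"### Instruction:\n{example['instruction']}{context}\n\n"
        f"### Response:\n{example['response']}"
    )
    return example

subset = subset.map(to_text)

# Roughly 90% train / 10% validation split.
splits = subset.train_test_split(test_size=0.1, seed=42)
train_ds, eval_ds = splits["train"], splits["test"]
```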
Training Configuration
LoRA Hyperparameters
```json
{
"r": 8,
"lora_alpha": 32,
"target_modules": ["q_proj", "k_proj", "v_proj", "out_proj", "fc_in", "fc_out", "wte"],
"lora_dropout": 0.05,
"bias": "none",
"task_type": "CAUSAL_LM"
}
```
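In code, these values correspond to a peft LoraConfig along the following lines (a minimal sketch mirroring the table above):

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    # target_modules copied from the configuration above; PEFT only injects
    # LoRA layers into the modules that actually exist in the base model.
    target_modules=["q_proj", "k_proj", "v_proj", "out_proj", "fc_in", "fc_out", "wte"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
```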
Training Hyperparameters
```json
{
"num_train_epochs": 3,
"per_device_train_batch_size": 1,
"gradient_accumulation_steps": 1,
"learning_rate": 2e-4,
"weight_decay": 0.001,
"warmup_ratio": 0.03,
"lr_scheduler_type": "constant",
"max_seq_length": 256,
"optim": "paged_adamw_32bit",
"gradient_checkpointing": true,
"eval_strategy": "steps",
"eval_steps": 100,
"save_steps": 100,
"logging_steps": 100,
"packing": true
}
```
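These settings map onto TRL's SFTConfig and SFTTrainer roughly as shown below. This is a sketch, not the original notebook: output_dir and the model/dataset variables are illustrative, and some argument names (e.g. max_seq_length, eval_strategy) vary slightly between trl and transformers versions.

```python
from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="./llama-3b-lora-dolly",  # illustrative output path
    num_train_epochs=3,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=1,
    learning_rate=2e-4,
    weight_decay=0.001,
    warmup_ratio=0.03,
    lr_scheduler_type="constant",
    optim="paged_adamw_32bit",
    gradient_checkpointing=True,
    eval_strategy="steps",
    eval_steps=100,
    save_steps=100,
    logging_steps=100,
    max_seq_length=256,
    packing=True,
    dataset_text_field="text",
)

trainer = SFTTrainer(
    model=base_model,         # 4-bit quantized base model (see QLoRA config below)
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    peft_config=lora_config,  # LoRA settings from the section above
)
trainer.train()
```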
QLoRA Hyperparameters
```json
{
"load_in_4bit": true,
"bnb_4bit_quant_type": "nf4",
"bnb_4bit_compute_dtype": "float16",
"bnb_4bit_use_double_quant": false
}
```
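These map to a bitsandbytes quantization config that is passed when loading the base model (a minimal sketch):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=False,
)

base_model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.2-3B",
    quantization_config=bnb_config,
    device_map="auto",
)
```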
Training Setup
```json
{
"framework": "PyTorch with Hugging Face Transformers",
"fine_tuning_method": "LoRA (Low-Rank Adaptation)",
"quantization": "4-bit NF4 with QLoRA",
"compute": "Google Colab T4",
"gpu_memory": "~16GB VRAM"
}
```
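Instead of passing peft_config to SFTTrainer, the adapter can also be attached to the 4-bit base model explicitly before training; whether the original notebook used this explicit step is not documented here. A sketch:

```python
from peft import get_peft_model, prepare_model_for_kbit_training

# Prepare the quantized base model for training (casts norms/embeddings to a
# stable dtype and enables input gradients for gradient checkpointing).
base_model = prepare_model_for_kbit_training(base_model)

# Attach the LoRA adapter; only ~0.33% of parameters become trainable.
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```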
Usage
Installation
```bash
pip install torch transformers accelerate peft bitsandbytes
```
Basic Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
# Load base model
model_name = "meta-llama/Llama-3.2-3B"
base_model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.float16,
low_cpu_mem_usage=True,
return_dict=True,
device_map="auto",
)
# Load Tokenizer
tokenizer = AutoTokenizer.from_pretrained(
"MagicaNeko/llama-3b-lora-dolly",
subfolder="model-ft-tokenizer"
)
# Load LoRA Adapter
model = PeftModel.from_pretrained(
base_model,
"MagicaNeko/llama-3b-lora-dolly",
subfolder="model-ft-lora-adapter"
)
# Merge adapters
model = model.merge_and_unload()
# Inference
prompt = "What is machine learning?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
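To avoid re-merging on every load, the merged model can also be saved and reloaded as a standalone checkpoint (continuing from the snippet above; the local path is illustrative):

```python
# Persist the merged weights and tokenizer, then reload them without peft.
model.save_pretrained("./llama-3b-dolly-merged")
tokenizer.save_pretrained("./llama-3b-dolly-merged")

merged = AutoModelForCausalLM.from_pretrained(
    "./llama-3b-dolly-merged",
    torch_dtype=torch.float16,
    device_map="auto",
)
```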
Training Code
The training notebook is available at: Colab link
License
This LoRA adapter is released as open source under the Apache 2.0 License. It contains only the adapter weights and does not include any Meta Llama 3.2 3B base model weights.
You must still comply with the Meta Llama 3.2 license when using the base model together with this adapter.