Update README.md
README.md
CHANGED
@@ -50,7 +50,7 @@ for large language model in regard to low resourced morphologically rich African
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:**
+- **Repository:** https://github.com/PKhoboko/MSc-Thesis
 - **Paper [optional]:** https://www.sciencedirect.com/science/article/pii/S2666827025000325
 - **Demo [optional]:** [More Information Needed]
 
@@ -135,7 +135,29 @@ translator("Translate to Zulu: The cow is eating grass.")
 
 #### Training Hyperparameters
 
-- **Training regime:**
+- **Training regime:**
+- peft_config = LoraConfig(
+lora_alpha=16,
+lora_dropout=0.05,
+r=16,
+bias="none",
+task_type="CAUSAL_LM",
+target_modules=['k_proj', 'q_proj', 'v_proj', 'o_proj','gate_proj', 'down_proj', 'up_proj']
+)
+
+- TrainingArguments(
+optim="paged_adamw_8bit",
+per_device_train_batch_size=32,
+gradient_accumulation_steps=4,
+log_level="debug",
+save_steps=400,
+logging_steps=10,
+learning_rate=4e-4,
+num_train_epochs=2,
+warmup_steps=100,
+lr_scheduler_type="linear",
+)
+
 
 #### Speeds, Sizes, Times [optional]
 
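
Review note: the hyperparameters added in this change imply a few derived quantities that the README does not state. The sketch below works them out in plain Python, using only the values visible in the diff. The helper names (`lora_scaling`, `effective_batch_size`, `linear_lr`), the `num_devices` parameter, and the `total_steps` argument are assumptions introduced for illustration; the diff gives neither the device count nor the dataset size, so the total number of optimizer steps is unknown. The schedule function mirrors the usual "linear warmup, then linear decay to zero" shape selected by `lr_scheduler_type="linear"`; it is not the training code from the repository.

```python
# Values copied from the LoraConfig / TrainingArguments shown in the diff.
LORA_ALPHA = 16
LORA_R = 16
PER_DEVICE_BATCH = 32
GRAD_ACCUM_STEPS = 4
LEARNING_RATE = 4e-4
WARMUP_STEPS = 100


def lora_scaling(alpha: int, r: int) -> float:
    """Standard LoRA scaling factor (alpha / r) applied to the low-rank update."""
    return alpha / r


def effective_batch_size(per_device: int, accum_steps: int, num_devices: int = 1) -> int:
    """Examples consumed per optimizer step.

    num_devices is hypothetical: the diff does not say how many GPUs were used.
    """
    return per_device * accum_steps * num_devices


def linear_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE, warmup: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then linear decay to 0 at total_steps.

    total_steps is hypothetical: it depends on the dataset size, which is not given.
    """
    if step < warmup:
        return base_lr * step / max(1, warmup)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup)


if __name__ == "__main__":
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_R))
    print("Effective batch size (1 device):",
          effective_batch_size(PER_DEVICE_BATCH, GRAD_ACCUM_STEPS))
    # Assumed 1000 total steps, purely to show the shape of the schedule.
    for s in (0, 50, 100, 550, 1000):
        print(f"step {s:4d}: lr = {linear_lr(s, total_steps=1000):.2e}")
```

With `r=16` and `lora_alpha=16` the adapter update is applied at a scale of 1, and each optimizer step sees 32 × 4 = 128 examples per device. Stating these in the README alongside the raw arguments would make the training regime easier to compare against other runs.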