PKhoboko committed
Commit 8b5746a · verified · 1 Parent(s): e8348e7

Update README.md

Files changed (1): README.md (+24 -2)
README.md CHANGED
@@ -50,7 +50,7 @@ for large language model in regard to low resourced morphologically rich African
 
 <!-- Provide the basic links for the model. -->
 
-- **Repository:** [More Information Needed]
+- **Repository:** https://github.com/PKhoboko/MSc-Thesis
 - **Paper [optional]:** https://www.sciencedirect.com/science/article/pii/S2666827025000325
 - **Demo [optional]:** [More Information Needed]
 
@@ -135,7 +135,29 @@ translator("Translate to Zulu: The cow is eating grass.")
 
 #### Training Hyperparameters
 
-- **Training regime:** [More Information Needed] <!--fp32, fp16 mixed precision, bf16 mixed precision, bf16 non-mixed precision, fp16 non-mixed precision, fp8 mixed precision -->
+- **Training regime:**
+  - peft_config = LoraConfig(
+        lora_alpha=16,
+        lora_dropout=0.05,
+        r=16,
+        bias="none",
+        task_type="CAUSAL_LM",
+        target_modules=['k_proj', 'q_proj', 'v_proj', 'o_proj', 'gate_proj', 'down_proj', 'up_proj']
+    )
+
+  - TrainingArguments(
+        optim="paged_adamw_8bit",
+        per_device_train_batch_size=32,
+        gradient_accumulation_steps=4,
+        log_level="debug",
+        save_steps=400,
+        logging_steps=10,
+        learning_rate=4e-4,
+        num_train_epochs=2,
+        warmup_steps=100,
+        lr_scheduler_type="linear",
+    )
+
 
 #### Speeds, Sizes, Times [optional]
 
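For readers who want to see how the committed hyperparameters fit together, here is a minimal sketch wiring the `LoraConfig` and `TrainingArguments` from this diff into a `peft`/`transformers` training loop. The base checkpoint, output directory, and toy dataset are placeholder assumptions (the commit does not name them); the `target_modules` list matches Llama-style decoder blocks, so any such causal LM should slot in.

```python
# Hypothetical wiring of the committed hyperparameters; BASE_MODEL, output_dir,
# and the toy dataset are assumptions, not part of the commit.
from datasets import Dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

BASE_MODEL = "meta-llama/Llama-2-7b-hf"  # assumption: any Llama-style causal LM

tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(BASE_MODEL)

# LoRA adapter config, verbatim from the commit.
peft_config = LoraConfig(
    lora_alpha=16,
    lora_dropout=0.05,
    r=16,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "q_proj", "v_proj", "o_proj",
                    "gate_proj", "down_proj", "up_proj"],
)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # sanity check: only adapter weights train

# Toy stand-in for the real training corpus (hypothetical single row,
# borrowed from the README's translator example).
texts = ["Translate to Zulu: The cow is eating grass."]
train_ds = Dataset.from_list([dict(tokenizer(t)) for t in texts])

# Training arguments, verbatim from the commit; output_dir is a placeholder.
args = TrainingArguments(
    output_dir="lora-out",
    optim="paged_adamw_8bit",  # 8-bit paged AdamW; requires bitsandbytes
    per_device_train_batch_size=32,
    gradient_accumulation_steps=4,
    log_level="debug",
    save_steps=400,
    logging_steps=10,
    learning_rate=4e-4,
    num_train_epochs=2,
    warmup_steps=100,
    lr_scheduler_type="linear",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    # Causal-LM collator: pads the batch and copies input_ids into labels.
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```

Note the effective batch size implied by the commit: 32 sequences per device × 4 accumulation steps = 128 sequences per optimizer step, with a 100-step linear warmup to the 4e-4 peak learning rate.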