Commit
·
939ee59
1
Parent(s):
78fb48b
Initial
README.md
CHANGED
|
@@ -137,42 +137,44 @@ print(tokenizer.decode(outputs[0], skip_special_tokens=True))
137   | bnb_4bit_quant_type | `nf4` |
138   | bnb_4bit_use_double_quant | `true` |
139
140 -
141
142 -
143 -
144 -
145 -
146 -
147 -
148
149 -
150
151 -
152 -
153
154   #### Speeds, Sizes, Times
155
156 -
157 -
158 -
159 -
160 -
161 - - `checkpoint-24000` *(final checkpoint)*
162
163   #### Compute Infrastructure
164
165 -
166 -
167 -
168 -
169 -
170 -
171 -
172 -
173 - - OS: **Ubuntu 22.04**
174 - - Frameworks: **PyTorch 2.4.0**
175 - - CUDA Version: **12.4.1**
176
177   ---
178

137   | bnb_4bit_quant_type | `nf4` |
138   | bnb_4bit_use_double_quant | `true` |
139
140 + Below, I have created a separate table for each heading:
141 +
142 + #### Dataset
143
144 + | Parameter | Value |
145 + |----------------------|--------------------------------|
146 + | Dataset Name | `nvidia/OpenCodeReasoning` |
147 + | Split | `split_0` |
148 + | Number of Rows | `8000` |
149 + | Max Token Length | `8192` |
150 + | Shuffle | `True` |
151 + | Number of Processes | `4` |
152
153 + #### Tokenizer
154
155 + | Parameter | Value |
156 + |--------------------------------|-------------------------------|
157 + | Truncation | Enabled (`max_length=8192`) |
158 + | Masked Language Modeling (MLM) | `False` |
159
160   #### Speeds, Sizes, Times
161
162 + | Parameter | Value |
163 + |-------------------------|-------------------------------------------------------------|
164 + | Total Training Time | ~3.5 hours |
165 + | Checkpoint Frequency | every `10000` steps |
166 + | Checkpoint Steps | `checkpoint-10000`, `checkpoint-20000`, `checkpoint-24000` |
167
168   #### Compute Infrastructure
169
170 + | Parameter | Value |
171 + |--------------|--------------------------------------|
172 + | GPU | 1 × NVIDIA H100 SXM (80 GB VRAM) |
173 + | RAM | 125 GB |
174 + | CPU | 16 vCPU |
175 + | OS | Ubuntu 22.04 |
176 + | Frameworks | PyTorch 2.4.0 |
177 + | CUDA Version | 12.4.1 |
178
179   ---
180
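The "Checkpoint Steps" row lists `checkpoint-24000` even though 24000 is not a multiple of the `10000`-step checkpoint frequency; the removed README line labels it as the final checkpoint. A minimal sketch of that arithmetic follows. It assumes the run ended at step 24000 (inferred from the final checkpoint name, not stated in the diff) and that a checkpoint is also written at the last step; the helper `checkpoint_steps` is hypothetical and not part of any training library.

```python
def checkpoint_steps(total_steps: int, save_every: int) -> list[int]:
    """Steps at which a checkpoint is written: every `save_every` steps,
    plus one at the final step if it is not already on that schedule."""
    steps = list(range(save_every, total_steps + 1, save_every))
    if not steps or steps[-1] != total_steps:
        steps.append(total_steps)  # final checkpoint at the end of training
    return steps


# Values from the README tables; total_steps=24000 is an assumption.
names = [f"checkpoint-{s}" for s in checkpoint_steps(24000, 10000)]
print(names)

# Rough throughput implied by "~3.5 hours" (approximate by construction).
steps_per_second = 24000 / (3.5 * 3600)
print(f"about {steps_per_second:.1f} steps/s")
```

Under these assumptions the schedule yields the same three checkpoint names as the table.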