Update README.md
README.md
CHANGED
@@ -18,11 +18,11 @@ pipeline_tag: text-generation
 * A demonstration notebook is available on Google Colab (click the badge below). Please note that the training code has been omitted from this notebook. It is intended solely for testing and inference using the latest checkpoint.
 [](https://drive.google.com/file/d/1Q4jtRjIkFWIAM82pAg4OBPCLjpQ8ndpI/view?usp=sharing)
 
-* Note: The
+* Note: The initial training was conducted on a dataset with errors rather than a perfectly preprocessed one: <span style="color:red;">**garbage in, garbage out**</span>. As a result, while the model successfully adheres to the desired YAML format and demonstrates structured reasoning, its performance remains <span style="color:red;">**unstable**</span>. Future iterations will focus on retraining with a <span style="color:red;">**more extensive, high-quality dataset**</span> to improve stability and accuracy.
 
-* Apologies for the current state of the project. The initial version has some inconsistencies,
+* Apologies for the current state of the project. The initial version has some inconsistencies due to training on the old dataset, [tuandunghcmut/normal_dataset](https://huggingface.co/datasets/tuandunghcmut/normal_dataset). Future plans include refactoring the code into a more structured format, expanding the dataset to the new one, [tuandunghcmut/coding-mcq-reasoning](https://huggingface.co/datasets/tuandunghcmut/coding-mcq-reasoning), and retraining the model using distributed training for improved scalability. Additionally, I plan to train on a larger, high-quality dataset to enhance performance and ensure better stability.
 
-* The guide below provides an explanation of the code presented in the notebook.
+* The guide below provides an explanation of the code presented in the notebook. I hope you will understand my ideas and the structure of the code.
 
 ## Installation
 