Update README.md
README.md CHANGED

@@ -9,16 +9,7 @@ base_model:
 - Qwen/Qwen2.5-Math-7B
 ---
 
-
-license: apache-2.0
-library_name: transformers
-pipeline_tag: text-generation
-datasets:
-- Satori-reasoning/Satori_FT_data
-- Satori-reasoning/Satori_RL_data
-base_model:
-- Qwen/Qwen2.5-Math-7B
----
+
 **Satori-7B-Round2** is a 7B LLM trained on open-source model (Qwen-2.5-Math-7B) and open-source data (OpenMathInstruct-2 and NuminaMath). **Satori-7B-Round2** is capable of autoregressive search, i.e., self-reflection and self-exploration without external guidance.
 This is achieved through our proposed Chain-of-Action-Thought (COAT) reasoning and a two-stage post-training paradigm.
 
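The hunk above removes a stray second copy of the model-card metadata (`license`, `base_model`, and so on) that sat below the closing `---` of the YAML front matter. A minimal sketch of how such duplication could be flagged, using a crude top-level-key heuristic; the function name is illustrative and not part of any Hugging Face tooling:

```python
import re
from collections import Counter

def duplicated_top_level_keys(text: str) -> list[str]:
    """Return YAML-style top-level keys (e.g. 'license', 'base_model') that
    appear more than once in the file -- the symptom this diff fixes.

    Heuristic only: a body line such as 'Note: ...' would also match, so this
    is a quick lint, not a real YAML front-matter parser.
    """
    counts = Counter()
    for line in text.splitlines():
        # A top-level key starts at column 0 and is immediately followed by ':'.
        match = re.match(r"^([A-Za-z_][A-Za-z0-9_]*):", line)
        if match:
            counts[match.group(1)] += 1
    return [key for key, n in counts.items() if n > 1]
```

Run against the pre-edit README this reports `license` and `base_model` (among others) twice; after the edit it reports nothing, since each key now appears only once in the single front-matter block.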