- Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning
  Paper • 2506.01939 • Published • 187
- ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration
  Paper • 2511.21689 • Published • 99
- PretrainZero: Reinforcement Active Pretraining
  Paper • 2512.03442 • Published • 41
wenzel zhang
wenzel94
AI & ML interests
None yet
Recent Activity
updated a collection 5 days ago: LLM RL
updated a collection 5 days ago: LLM RL
upvoted a paper 5 days ago: ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration