AI & Human Co-Improvement for Safer Co-Superintelligence Paper • 2512.05356 • Published 4 days ago • 4
From Imitation to Discrimination: Toward A Generalized Curriculum Advantage Mechanism Enhancing Cross-Domain Reasoning Tasks Paper • 2512.02580 • Published 6 days ago • 23
Guided Self-Evolving LLMs with Minimal Human Supervision Paper • 2512.02472 • Published 7 days ago • 47
Video Generation Models Are Good Latent Reward Models Paper • 2511.21541 • Published 12 days ago • 45
Agent0-VL: Exploring Self-Evolving Agent for Tool-Integrated Vision-Language Reasoning Paper • 2511.19900 • Published 14 days ago • 46
Insights from the ICLR Peer Review and Rebuttal Process Paper • 2511.15462 • Published 19 days ago • 6
Agent0: Unleashing Self-Evolving Agents from Zero Data via Tool-Integrated Reasoning Paper • 2511.16043 • Published 19 days ago • 105
First Frame Is the Place to Go for Video Content Customization Paper • 2511.15700 • Published 19 days ago • 52
VisPlay: Self-Evolving Vision-Language Models from Images Paper • 2511.15661 • Published 19 days ago • 42
P1: Mastering Physics Olympiads with Reinforcement Learning Paper • 2511.13612 • Published 21 days ago • 132
MarsRL: Advancing Multi-Agent Reasoning System via Reinforcement Learning with Agentic Pipeline Parallelism Paper • 2511.11373 • Published 24 days ago • 12
Black-Box On-Policy Distillation of Large Language Models Paper • 2511.10643 • Published 25 days ago • 46
Too Good to be Bad: On the Failure of LLMs to Role-Play Villains Paper • 2511.04962 • Published Nov 7 • 52