Vision-Zero: Scalable VLM Self-Improvement via Strategic Gamified Self-Play Paper • 2509.25541 • Published Sep 29 • 140
ExCyTIn-Bench: Evaluating LLM agents on Cyber Threat Investigation Paper • 2507.14201 • Published Jul 14 • 2
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework Paper • 2308.08155 • Published Aug 16, 2023 • 10
LLM-RankFusion: Mitigating Intrinsic Inconsistency in LLM-based Ranking Paper • 2406.00231 • Published May 31, 2024 • 1
Advances and Challenges in Foundation Agents: From Brain-Inspired Intelligence to Evolutionary, Collaborative, and Safe Systems Paper • 2504.01990 • Published Mar 31 • 300
Which Agent Causes Task Failures and When? On Automated Failure Attribution of LLM Multi-Agent Systems Paper • 2505.00212 • Published Apr 30 • 9
A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence Paper • 2507.21046 • Published Jul 28 • 82
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 188