AdaCoT: Pareto-Optimal Adaptive Chain-of-Thought Triggering via Reinforcement Learning Paper • 2505.11896 • Published May 17 • 58
Thinking Longer, Not Larger: Enhancing Software Engineering Agents via Scaling Test-Time Compute Paper • 2503.23803 • Published Mar 31 • 8
Multi-SWE-bench: A Multilingual Benchmark for Issue Resolving Paper • 2504.02605 • Published Apr 3 • 48
Codev-Bench: How Do LLMs Understand Developer-Centric Code Completion? Paper • 2410.01353 • Published Oct 2, 2024 • 1
OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis Paper • 2501.04561 • Published Jan 8 • 17