WorldGen: From Text to Traversable and Interactive 3D Worlds Paper • 2511.16825 • Published 18 days ago • 21
STARFlow-V: End-to-End Video Generative Modeling with Normalizing Flow Paper • 2511.20462 • Published 13 days ago • 29
Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens Paper • 2511.19418 • Published 14 days ago • 26
LLaVA-OneVision-1.5: Fully Open Framework for Democratized Multimodal Training Paper • 2509.23661 • Published Sep 28 • 46
A Vision-Language-Action-Critic Model for Robotic Real-World Reinforcement Learning Paper • 2509.15937 • Published Sep 19 • 20
The Well: a Large-Scale Collection of Diverse Physics Simulations for Machine Learning Paper • 2412.00568 • Published Nov 30, 2024 • 23
A Survey of Reinforcement Learning for Large Reasoning Models Paper • 2509.08827 • Published Sep 10 • 189
ThinkAct: Vision-Language-Action Reasoning via Reinforced Visual Latent Planning Paper • 2507.16815 • Published Jul 22 • 39