InSight-o3: Empowering Multimodal Foundation Models with Generalized Visual Search Paper • 2512.18745 • Published 9 days ago • 9
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer Paper • 2408.06072 • Published Aug 12, 2024 • 38
ControlNeXt: Powerful and Efficient Control for Image and Video Generation Paper • 2408.06070 • Published Aug 12, 2024 • 55
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery Paper • 2408.06292 • Published Aug 12, 2024 • 127