DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models Paper • 2512.02556 • Published 8 days ago • 194
Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer Paper • 2511.22699 • Published 13 days ago • 179
The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation Paper • 2511.20256 • Published 15 days ago • 26
Computer-Use Agents as Judges for Generative User Interface Paper • 2511.15567 • Published 21 days ago • 51
WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation Paper • 2511.11434 • Published 26 days ago • 44
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published 27 days ago • 93
ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting Paper • 2411.17176 • Published Nov 26, 2024 • 24
VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation Paper • 2511.02778 • Published Nov 4 • 101
UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback Paper • 2511.01678 • Published Nov 3 • 34
Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers Paper • 2505.21497 • Published May 27 • 109
GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation Paper • 2504.02782 • Published Apr 3 • 57
ShowUI: One Vision-Language-Action Model for GUI Visual Agent Paper • 2411.17465 • Published Nov 26, 2024 • 90