DocReward: A Document Reward Model for Structuring and Stylizing Paper • 2510.11391 • Published Oct 13 • 27
ScaleCUA: Scaling Open-Source Computer Use Agents with Cross-Platform Data Paper • 2509.15221 • Published Sep 18 • 111
Beyond Pass@1: Self-Play with Variational Problem Synthesis Sustains RLVR Paper • 2508.14029 • Published Aug 19 • 118
OmniPart: Part-Aware 3D Generation with Semantic Decoupling and Structural Cohesion Paper • 2507.06165 • Published Jul 8 • 58
Geometry Forcing: Marrying Video Diffusion and 3D Representation for Consistent World Modeling Paper • 2507.07982 • Published Jul 10 • 33
ImgEdit: A Unified Image Editing Dataset and Benchmark Paper • 2505.20275 • Published May 26 • 18
MineWorld: a Real-Time and Open-Source Interactive World Model on Minecraft Paper • 2504.08388 • Published Apr 11 • 42
An Empirical Study of GPT-4o Image Generation Capabilities Paper • 2504.05979 • Published Apr 8 • 64
Large Motion Video Autoencoding with Cross-modal Video VAE Paper • 2412.17805 • Published Dec 23, 2024 • 24
RoLoRA Collection [EMNLP2024] RoLoRA: Fine-tuning Rotated Outlier-free LLMs for Effective Weight-Activation Quantization • 3 items • Updated Sep 26, 2024 • 3
WonderJourney: Going from Anywhere to Everywhere Paper • 2312.03884 • Published Dec 6, 2023 • 1
MM-Ego: Towards Building Egocentric Multimodal LLMs Paper • 2410.07177 • Published Oct 9, 2024 • 22
3DGS-DET: Empower 3D Gaussian Splatting with Boundary Guidance and Box-Focused Sampling for 3D Object Detection Paper • 2410.01647 • Published Oct 2, 2024 • 31