wu weijia

weijiawu

AI & ML interests

None yet

Recent Activity

upvoted a paper 8 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

upvoted a paper 10 days ago

Vision Bridge Transformer at Scale

upvoted a paper 10 days ago

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Organizations

upvoted a paper 8 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 8 days ago • 194

upvoted 2 papers 10 days ago

Vision Bridge Transformer at Scale

Paper • 2511.23199 • Published 12 days ago • 43

Z-Image: An Efficient Image Generation Foundation Model with Single-Stream Diffusion Transformer

Paper • 2511.22699 • Published 13 days ago • 179

upvoted a paper 14 days ago

The Image as Its Own Reward: Reinforcement Learning with Adversarial Reward for Image Generation

Paper • 2511.20256 • Published 15 days ago • 26

upvoted a paper 16 days ago

Computer-Use Agents as Judges for Generative User Interface

Paper • 2511.15567 • Published 21 days ago • 51

upvoted a paper 17 days ago

SAM 3: Segment Anything with Concepts

Paper • 2511.16719 • Published 20 days ago • 109

upvoted a paper 19 days ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published 20 days ago • 106

upvoted a paper 23 days ago

WEAVE: Unleashing and Benchmarking the In-context Interleaved Comprehension and Generation

Paper • 2511.11434 • Published 26 days ago • 44

upvoted a paper 25 days ago

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 27 days ago • 93

upvoted a paper 27 days ago

ChatGen: Automatic Text-to-Image Generation From FreeStyle Chatting

Paper • 2411.17176 • Published Nov 26, 2024 • 24

upvoted 2 papers about 1 month ago

VCode: a Multimodal Coding Benchmark with SVG as Symbolic Visual Representation

Paper • 2511.02778 • Published Nov 4 • 101

UniLumos: Fast and Unified Image and Video Relighting with Physics-Plausible Feedback

Paper • 2511.01678 • Published Nov 3 • 34

upvoted 2 papers 4 months ago

Reinforcement Learning in Vision: A Survey

Paper • 2508.08189 • Published Aug 11 • 29

Multi-human Interactive Talking Dataset

Paper • 2508.03050 • Published Aug 5 • 9

upvoted a paper 6 months ago

Paper2Poster: Towards Multimodal Poster Automation from Scientific Papers

Paper • 2505.21497 • Published May 27 • 109

upvoted a paper 8 months ago

GPT-ImgEval: A Comprehensive Benchmark for Diagnosing GPT4o in Image Generation

Paper • 2504.02782 • Published Apr 3 • 57

upvoted 2 papers 9 months ago

TPDiff: Temporal Pyramid Video Diffusion Model

Paper • 2503.09566 • Published Mar 12 • 45

Automated Movie Generation via Multi-Agent CoT Planning

Paper • 2503.07314 • Published Mar 10 • 44

upvoted a paper about 1 year ago

ShowUI: One Vision-Language-Action Model for GUI Visual Agent

Paper • 2411.17465 • Published Nov 26, 2024 • 90
