Co-GRPO: Co-Optimized Group Relative Policy Optimization for Masked Diffusion Model
Paper | Model | Code | Project Page

Introduction
Co-GRPO bridges the gap between single-step training and multi-step inference in Masked Diffusion Models. It is a reinforcement-learning framework that cooperatively optimizes both the generative model and the inference schedule, improving visual quality and prompt alignment without the cost of backpropagating through multiple sampling steps.
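At the core of GRPO-style training is a group-relative advantage: several samples are drawn for the same prompt, and each sample's reward is normalized against the statistics of its group. The sketch below illustrates that normalization step only; the function name, shapes, and epsilon are illustrative assumptions, not the released Co-GRPO implementation.

```python
# Illustrative sketch of the group-relative advantage used in
# GRPO-style objectives (not the official Co-GRPO code).
from typing import List

def group_relative_advantages(rewards: List[float], eps: float = 1e-8) -> List[float]:
    """Normalize rewards within a group of samples generated for the
    same prompt: advantage = (reward - group mean) / (group std + eps)."""
    n = len(rewards)
    mean = sum(rewards) / n
    var = sum((r - mean) ** 2 for r in rewards) / n
    std = var ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Example: rewards for four samples of one prompt.
advs = group_relative_advantages([0.2, 0.5, 0.9, 0.4])
```

Because advantages are centered within each group, they sum to (approximately) zero, so above-average samples are reinforced and below-average ones are suppressed without needing a learned value baseline.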
Usage
Please refer to the GitHub repository (see the Code link above) for installation and usage instructions.
Model tree for rpzhou/Co-GRPO-Meissonic-1B
Base model: MeissonFlow/Meissonic