
Co-GRPO: Co-Optimized Group Relative Policy Optimization for Masked Diffusion Model

Paper | Model | Code | Project Page

Introduction

We bridge the gap between single-step training and multi-step inference in Masked Diffusion Models by introducing Co-GRPO, a reinforcement-learning framework that cooperatively optimizes both the generative model and the inference schedule. This co-optimization achieves superior visual quality and prompt alignment without costly backpropagation through every sampling step.
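The model card does not include training code, but the "group relative" part of GRPO-style objectives refers to normalizing each sample's reward against the other samples drawn for the same prompt. A minimal sketch of that advantage computation is below; the function name and interface are illustrative assumptions, not the authors' implementation, and Co-GRPO's schedule co-optimization is not shown.

```python
from statistics import mean, pstdev

def group_relative_advantages(rewards, eps=1e-6):
    """GRPO-style advantage: z-score each reward within its group.

    rewards: list of scalar rewards, one per sample generated for the
             same prompt (the "group"). Illustrative sketch only.
    Returns a list of advantages with zero mean inside the group.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # eps guards against a zero-variance group
    return [(r - mu) / (sigma + eps) for r in rewards]
```

In such schemes the advantage weights the policy-gradient update for each sample, so samples scoring above their group's mean are reinforced and below-mean samples are suppressed, without needing a learned value baseline.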

Usage

Please refer to the GitHub repository (see the Code link above).
