Vision-Language-Vision Auto-Encoder: Scalable Knowledge Distillation from Diffusion Models Paper • 2507.07104 • Published Jul 9 • 45
De-Diffusion Makes Text a Strong Cross-Modal Interface Paper • 2311.00618 • Published Nov 1, 2023 • 23