MMMU

non-profit

https://mmmu-benchmark.github.io/

AI & ML interests

Multimodal Model Evaluation

Recent Activity

yuexiang96 authored a paper about 18 hours ago

Agent Data Protocol: Unifying Datasets for Diverse, Effective Fine-tuning of LLM Agents

yuexiang96 authored a paper about 18 hours ago

The Tool Decathlon: Benchmarking Language Agents for Diverse, Realistic, and Long-Horizon Task Execution

yuexiang96 authored a paper about 18 hours ago

Simulating Environments with Reasoning Models for Agent Training

MMMU's datasets (2)

MMMU/MMMU_Pro

Viewer • Updated Mar 8 • 5.19k rows • 7.59k downloads • 41 likes

MMMU/MMMU

Viewer • Updated Sep 19, 2024 • 11.6k rows • 69.7k downloads • 306 likes