Upload from GitHub Actions: TruthfulQA translation WIP fd102e9 verified davidpomerenke committed on Jul 4, 2025
Upload from GitHub Actions: Get more results, compute average based on all tasks 98c6811 verified davidpomerenke committed on Jul 2, 2025
Upload from GitHub Actions: Translate MMLU and evaluate 4c5c136 verified davidpomerenke committed on Jun 30, 2025
Upload from GitHub Actions: Correlation plot b0aa389 verified davidpomerenke committed on Jun 30, 2025
Upload from GitHub Actions: Evaluate on autotranslated GSM dataset f3a09a2 verified davidpomerenke committed on Jun 29, 2025
Upload from GitHub Actions: Evaluate Google Translate 338dc9b verified davidpomerenke committed on Jun 28, 2025
Upload from GitHub Actions: added some transparency to model contribution info 0044d85 verified davidpomerenke committed on Jun 7, 2025
Upload from GitHub Actions: Fix linter problems in frontend e8341d2 verified davidpomerenke committed on Jun 6, 2025
Upload from GitHub Actions: More models and languages a73f888 verified davidpomerenke committed on Jun 6, 2025
Upload from GitHub Actions: Results for 50 languages 3dfd880 verified davidpomerenke committed on Jun 6, 2025
Upload from GitHub Actions: Improve UX and style 70582ce verified davidpomerenke committed on Jun 6, 2025
Upload from GitHub Actions: Improve UX and style 53d2039 verified davidpomerenke committed on Jun 6, 2025
Upload from GitHub Actions: Merge remote changes with local frontend updates 760c6c6 verified davidpomerenke committed on Jun 5, 2025
Upload from GitHub Actions: Merge remote changes and apply terminology updates: Commercial->closed-source, Open->open-source ebaf279 verified davidpomerenke committed on Jun 4, 2025
Upload from GitHub Actions: Use task subset for average score b1e5b40 verified davidpomerenke committed on Jun 4, 2025
Upload from GitHub Actions: Evaluate on 40 languages 941d5c5 verified davidpomerenke committed on Jun 4, 2025
Upload from GitHub Actions: Make community links work, add CONTRIBUTING 3f60023 verified davidpomerenke committed on Jun 4, 2025
Upload from GitHub Actions: Add math benchmarks 549360a verified davidpomerenke committed on May 22, 2025
Upload from GitHub Actions: Update model ranking fetching f840423 verified davidpomerenke committed on May 22, 2025
Upload from GitHub Actions: Use FLORES+ via Huggingface 913253a verified davidpomerenke committed on May 22, 2025