metadata
title: Arabic Function Calling Leaderboard
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
- arabic
- function-calling
- leaderboard
- llm-evaluation
Arabic Function Calling Leaderboard
لوحة تقييم استدعاء الدوال بالعربية (Arabic Function Calling Evaluation Board)
Overview
The Arabic Function Calling Leaderboard (AFCL) evaluates Large Language Models on their ability to:
- Understand Arabic queries in Modern Standard Arabic (MSA) and regional dialects
- Select appropriate functions from available options
- Extract correct arguments from Arabic text
- Handle parallel and complex function calls
- Detect when no function should be called
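To make these abilities concrete, the sketch below shows one way a predicted set of function calls could be compared against a reference answer. It is an illustration only, not the leaderboard's actual scoring code: the record shape (`name`, `arguments`), the `get_weather` and `get_time` functions, and the helper names `normalize_call` and `calls_match` are all hypothetical.

```python
import json


def normalize_call(call):
    """Return a sortable, key-order-independent form of one function call.

    Hypothetical record shape: {"name": str, "arguments": dict}.
    """
    args = json.dumps(call.get("arguments", {}), sort_keys=True, ensure_ascii=False)
    return (call["name"], args)


def calls_match(predicted, expected):
    """True when both lists contain the same calls, ignoring their order."""
    return sorted(map(normalize_call, predicted)) == sorted(map(normalize_call, expected))


# Simple call: an Arabic query asking about the weather in Cairo.
expected = [{"name": "get_weather", "arguments": {"city": "القاهرة"}}]
predicted = [{"name": "get_weather", "arguments": {"city": "القاهرة"}}]
simple_ok = calls_match(predicted, expected)

# Parallel calls: the same two calls in a different order still match.
parallel_expected = [
    {"name": "get_weather", "arguments": {"city": "القاهرة"}},
    {"name": "get_time", "arguments": {"city": "دبي"}},
]
parallel_ok = calls_match(list(reversed(parallel_expected)), parallel_expected)

# Irrelevance detection: the correct answer is to call nothing at all.
irrelevance_ok = calls_match([], [])

print(simple_ok, parallel_ok, irrelevance_ok)
```

Comparing sorted, normalized calls treats parallel calls as an unordered collection. A stricter or looser comparison (for example, tolerating different spellings of the same city) would be a separate design decision.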
Models Evaluated
- Arabic-Native: Jais, ALLaM, SILMA, AceGPT
- Multilingual: Qwen, Llama, Gemma, Mistral, Phi, BLOOMZ, Aya
Dataset
Dataset: HeshamHaroon/Arabic_Function_Calling
- 1,470 total samples across 10 categories
- Simple, Multiple, Parallel, Parallel Multiple
- Irrelevance Detection
- Dialect Handling (Egyptian, Gulf, Levantine)
Evaluation
When the Space starts, the leaderboard automatically evaluates the models through the Hugging Face Inference API.
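As a rough picture of what such an evaluation pass might look like, the sketch below runs a model over a few samples and reports accuracy per category. Everything in it is an assumption made for illustration: the sample fields (`category`, `query`, `expected`), the JSON output format, and the `generate` callable, which stands in for a real Inference API request. The Space's own implementation may differ.

```python
import json
from collections import defaultdict


def parse_calls(raw_output):
    """Parse model text into a list of calls; unparseable text counts as no call."""
    try:
        parsed = json.loads(raw_output)
    except json.JSONDecodeError:
        return []
    if isinstance(parsed, dict):
        return [parsed]
    return parsed if isinstance(parsed, list) else []


def evaluate(samples, generate):
    """Return per-category accuracy for one model.

    `generate` is any callable mapping a query string to raw model text.
    Matching here is exact and order-sensitive, which is a simplification.
    """
    hits, totals = defaultdict(int), defaultdict(int)
    for sample in samples:
        predicted = parse_calls(generate(sample["query"]))
        totals[sample["category"]] += 1
        if predicted == sample["expected"]:
            hits[sample["category"]] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Two hypothetical samples: one simple call and one irrelevance case.
samples = [
    {
        "category": "simple",
        "query": "ما حالة الطقس في القاهرة؟",
        "expected": [{"name": "get_weather", "arguments": {"city": "القاهرة"}}],
    },
    {"category": "irrelevance", "query": "احكِ لي نكتة", "expected": []},
]


def stub_model(query):
    """Stand-in for a real model: always declines to call a function."""
    return "[]"


scores = evaluate(samples, stub_model)
print(scores)
```

Passing the model in as a callable keeps the scoring logic independent of how responses are fetched, so the same loop can be exercised offline with a stub. Note that counting unparseable output as "no call" would give credit on irrelevance samples, so a real harness may want to track parse failures separately.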
Citation
@misc{afcl2024,
title={Arabic Function Calling Leaderboard},
author={Hesham Haroon},
year={2024},
url={https://huggingface.co/spaces/HeshamHaroon/Arabic-Function-Calling-Leaderboard}
}