Estonian POS Tagging Model (XLM-RoBERTa-Base)

This model is a fine-tuned version of FacebookAI/xlm-roberta-base for Estonian Part-of-Speech (POS) tagging.
It is trained on Universal Dependencies Treebank (UDT) data; the evaluation below covers the EDT and EWT test sets.

With an overall tagging accuracy of 0.9775 (see Evaluation Results), the model can serve as a token-level annotator for downstream Estonian NLP tasks.
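The following is a minimal usage sketch, not the author's reference code. It assumes the repository name PitchayaS/xlm-roberta-estonian-pos, that the model's config carries an id2label mapping to POS tags, and that one tag per word is read from the first subword of each word (the card does not say how labels were aligned during training). The helper names first_subword_tags and tag_words are hypothetical.

```python
MODEL_ID = "PitchayaS/xlm-roberta-estonian-pos"  # assumed repository name


def first_subword_tags(word_ids, subword_labels):
    """Map each word index to the label predicted for its first subword.

    word_ids holds one entry per subword token: the index of the word it
    belongs to, or None for special tokens such as <s> and </s>.
    """
    tags = {}
    for word_id, label in zip(word_ids, subword_labels):
        if word_id is not None and word_id not in tags:
            tags[word_id] = label
    return tags


def tag_words(words, model_id=MODEL_ID):
    """Tag a pre-tokenized sentence; returns a list of (word, tag) pairs.

    Requires torch and transformers, and downloads the model on first use.
    """
    import torch
    from transformers import AutoModelForTokenClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForTokenClassification.from_pretrained(model_id)
    model.eval()

    # Keep the word boundaries so predictions can be mapped back to words.
    encoding = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**encoding).logits

    label_ids = logits.argmax(dim=-1)[0].tolist()
    labels = [model.config.id2label[i] for i in label_ids]
    tags = first_subword_tags(encoding.word_ids(0), labels)

    # A word that produced no subword tokens gets None instead of a tag.
    return [(word, tags.get(i)) for i, word in enumerate(words)]


# Example call (needs network access to fetch the model):
# tag_words(["Tere", ",", "maailm", "!"])
```

Reading the first subword is one common convention for word-level tagging with subword tokenizers; the generic token-classification pipeline in transformers is an alternative, but its aggregation modes are designed for entity spans and may merge adjacent words that share a tag.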

Evaluation Results

POS tagging accuracy on UDT test datasets (EDT + EWT): 0.9775

Label        Precision  Recall  F1-score  Support
ADJ               0.96    0.96      0.96     4902
ADP               0.95    0.97      0.96     1134
ADV               0.97    0.98      0.97     6511
AUX               0.98    0.99      0.98     3409
CCONJ             0.99    0.99      0.99     2510
DET               0.91    0.93      0.92     1142
INTJ              0.85    0.83      0.84      129
NOUN              0.98    0.98      0.98    15336
NUM               0.97    0.95      0.96     1104
PRON              0.98    0.98      0.98     3425
PROPN             0.96    0.95      0.96     3805
PUNCT             1.00    1.00      1.00     9939
SCONJ             0.98    0.98      0.98     1459
SYM               0.85    0.73      0.79       63
VERB              0.99    0.98      0.98     6746
X                 0.78    0.53      0.63       75
Accuracy             -       -      0.98    61689
Macro avg         0.94    0.92      0.93    61689
Weighted avg      0.98    0.98      0.98    61689
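
The gap between the macro and weighted averages comes from how the rare classes (INTJ, SYM, X) are counted. The sketch below recomputes both F1 averages from the per-label rows above; because it starts from the rounded two-decimal values in the table, the results only agree with the reported figures after rounding.

```python
# (label, F1-score, support) copied from the table above.
ROWS = [
    ("ADJ", 0.96, 4902), ("ADP", 0.96, 1134), ("ADV", 0.97, 6511),
    ("AUX", 0.98, 3409), ("CCONJ", 0.99, 2510), ("DET", 0.92, 1142),
    ("INTJ", 0.84, 129), ("NOUN", 0.98, 15336), ("NUM", 0.96, 1104),
    ("PRON", 0.98, 3425), ("PROPN", 0.96, 3805), ("PUNCT", 1.00, 9939),
    ("SCONJ", 0.98, 1459), ("SYM", 0.79, 63), ("VERB", 0.98, 6746),
    ("X", 0.63, 75),
]


def macro_avg(rows):
    """Unweighted mean: every label counts equally, however rare."""
    return sum(f1 for _, f1, _ in rows) / len(rows)


def weighted_avg(rows):
    """Support-weighted mean: frequent labels dominate the result."""
    total = sum(support for _, _, support in rows)
    return sum(f1 * support for _, f1, support in rows) / total


total_support = sum(support for _, _, support in ROWS)
print(f"tokens:      {total_support}")            # 61689
print(f"macro F1:    {macro_avg(ROWS):.2f}")      # 0.93
print(f"weighted F1: {weighted_avg(ROWS):.2f}")   # 0.98
```

INTJ, SYM, and X together account for 267 of the 61689 test tokens, so their lower scores pull the macro average down to 0.93 while barely affecting the weighted average.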

Model size: 0.3B params
Tensor type: F32 (Safetensors)

Model repository: PitchayaS/xlm-roberta-estonian-pos