πŸ”οΈ MarianMT English β†’ Atlasic Tamazight (Tachelhit / Central Atlas Tamazight)

This model is a fine-tuned version of Helsinki-NLP/opus-mt-en-ber that translates from English β†’ Atlasic Tamazight (Tachelhit/Central Atlas Tamazight).


πŸ“˜ Model Overview

| Property | Description |
|---|---|
| Base Model | Helsinki-NLP/opus-mt-en-ber |
| Architecture | MarianMT |
| Languages | English β†’ Tamazight (Tachelhit / Central Atlas Tamazight) |
| Fine-tuning Dataset | 97K medium-quality synthetic sentence pairs generated by translating English corpora |
| Training Objective | Sequence-to-sequence translation fine-tuning |
| Framework | πŸ€— Transformers |
| Tokenizer | SentencePiece |
| Model Size | 62.6M parameters (F32, Safetensors) |
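
A minimal usage sketch, assuming the checkpoint is loaded from the repository id shown on this card (ilyasaqit/opus-mt-en-atlasic_tamazight-synth97k-nmv) with the standard MarianMT classes:

```python
# Minimal usage sketch; the repository id below is taken from this card's page.
from transformers import MarianMTModel, MarianTokenizer

model_id = "ilyasaqit/opus-mt-en-atlasic_tamazight-synth97k-nmv"
tokenizer = MarianTokenizer.from_pretrained(model_id)
model = MarianMTModel.from_pretrained(model_id)

def translate(text: str) -> str:
    # Tokenize, generate with the recommended decoding settings, and decode.
    batch = tokenizer([text], return_tensors="pt", truncation=True, max_length=128)
    generated = model.generate(
        **batch,
        num_beams=5,
        no_repeat_ngram_size=3,
        repetition_penalty=1.5,
        max_length=128,
    )
    return tokenizer.decode(generated[0], skip_special_tokens=True)

print(translate("I will go to school."))  # e.g. "Rad ftuΙ£ s tinml."
```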

🧠 Training Details

| Hyperparameter | Value |
|---|---|
| per_device_train_batch_size | 16 |
| per_device_eval_batch_size | 48 |
| learning_rate | 2e-5 |
| num_train_epochs | 5 |
| max_length | 128 |
| num_beams | 5 |
| eval_steps | 5000 |
| save_steps | 5000 |
| generation_no_repeat_ngram_size | 3 |
| generation_repetition_penalty | 1.5 |
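
For reference, a hedged sketch of how these hyperparameters map onto transformers.Seq2SeqTrainingArguments; the actual training script, dataset loading, and preprocessing are not shown here, and the output directory name is hypothetical:

```python
# Sketch only: reconstructs the hyperparameters above as Seq2SeqTrainingArguments.
from transformers import Seq2SeqTrainingArguments

training_args = Seq2SeqTrainingArguments(
    output_dir="marian-en-tamazight",   # hypothetical output path
    per_device_train_batch_size=16,
    per_device_eval_batch_size=48,
    learning_rate=2e-5,
    num_train_epochs=5,
    evaluation_strategy="steps",
    eval_steps=5000,
    save_strategy="steps",
    save_steps=5000,
    predict_with_generate=True,         # needed to compute BLEU during evaluation
    generation_max_length=128,
    generation_num_beams=5,
)
# no_repeat_ngram_size=3 and repetition_penalty=1.5 are not Seq2SeqTrainingArguments
# fields; they would typically be set on the model's generation_config instead.
```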

Training Environment:

  • 1 Γ— NVIDIA P100 (16 GB) on Kaggle
  • Total training time: 2 h 4 m 38 s

πŸ“ˆ Evaluation Results

| Step | Train Loss | Val Loss | BLEU |
|---|---|---|---|
| 5,000 | 0.453 | 0.4296 | 3.24 |
| 10,000 | 0.386 | 0.3777 | 4.97 |
| 15,000 | 0.357 | 0.3546 | 5.99 |
| 20,000 | 0.334 | 0.3419 | 6.60 |
| 25,000 | 0.326 | 0.3351 | 7.02 |
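
The metric implementation behind the BLEU column is not specified on this card; one common way to reproduce such a score is sacrebleu via the evaluate library, sketched below:

```python
# Hedged sketch: scoring model outputs against references with sacrebleu.
import evaluate

bleu = evaluate.load("sacrebleu")
predictions = ["Rad ftuΙ£ s tinml."]    # model outputs
references = [["Rad ftuΙ£ s tinml."]]   # one list of reference translations per prediction
print(bleu.compute(predictions=predictions, references=references)["score"])  # 100.0 for an exact match
```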

πŸ’¬ Example Translations

| English | Atlasic Tamazight |
|---|---|
| I will go to school. | Rad ftuΙ£ s tinml. |
| What did you say? | Mad tnnit? |
| I'm not talking to you, I'm talking to him! | Ur dik a s ar sawalΙ£!!! |
| Everyone has a secret face. | Kraygatt yan ila yat tguri. |

Hugging Face Space:
πŸ‘‰ ilyasaqit/English-Tamazight-Translator


πŸͺΆ Notes

  • The dataset is synthetic, not manually verified.
  • The model performs best on short and simple general-domain sentences.
  • Recommended decoding parameters:
    • num_beams=5
    • repetition_penalty=1.2–1.5
    • no_repeat_ngram_size=3

πŸ“š Citation

If you use this model, please cite:

@misc{marian-en-tamazight-2025,
  title  = {MarianMT English β†’ Atlasic Tamazight (Tachelhit / Central Atlas)},
  year   = {2025},
  url    = {https://huggingface.co/ilyasaqit/stage2_marian_opus_synth_model_kaggle2}
}