I converted and quantised this model from NVIDIA Parakeet TDT 0.6B V2 (En) because the existing conversion at https://huggingface.co/istupakov/parakeet-tdt-0.6b-v2-onnx produced errors in the transcribed text, such as unintended word concatenation.
Reconverting the Model
If you need to reconvert from source, I've included a few helper scripts in conversion_scripts/.
Requirements
- Python 3.12 (3.13+ has compatibility issues with NeMo dependencies)
- ~4GB disk space for the source .nemo file (downloaded automatically)
Setup
cd conversion_scripts
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
PyTorch 2.5 is required as newer versions have breaking changes in the ONNX export API.
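The version constraints above can be checked before starting a long export. This is a minimal sketch, not part of conversion_scripts/: the function names check_environment and installed_torch_version are hypothetical, and it assumes any 2.5.x PyTorch release is acceptable.

```python
import sys
from importlib import metadata


def check_environment(python_version, torch_version):
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    # Python 3.13+ is reported above to have compatibility issues with NeMo dependencies.
    if tuple(python_version[:2]) >= (3, 13):
        problems.append("Python 3.13+ has compatibility issues with NeMo dependencies")
    if torch_version is None:
        problems.append("PyTorch is not installed")
    else:
        # Drop any local build suffix such as "+cu121", then compare major.minor only.
        major_minor = torch_version.split("+")[0].split(".")[:2]
        if major_minor != ["2", "5"]:
            problems.append(f"PyTorch 2.5 is required, found {torch_version}")
    return problems


def installed_torch_version():
    """Look up the installed torch version without importing torch itself."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for problem in check_environment(sys.version_info, installed_torch_version()):
        print("WARNING:", problem)
```

Run it inside the activated venv; no output means both versions match the requirements listed here.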
Export
The scripts automatically download the source model if not present. Output files go to ./output/.
# Build the mel-spectrogram preprocessor
python build_preprocessor.py
# Export and quantise to int8 (dynamic quantisation - fast)
python export_model.py
# Or use static quantisation with calibration data (better quality)
python export_model.py --quantise=static
# Or export fp32 only (no quantisation)
python export_model.py --no-quantise
Dynamic vs Static Quantisation
Dynamic quantisation (default): Fast, no calibration data needed. Weights are quantised at export time, activations are quantised at runtime. Good balance of speed and quality.
Static quantisation (--quantise=static): Uses 200 LibriSpeech samples to calibrate activation ranges. Both weights and activations are quantised with fixed scales. May provide better accuracy for some use cases but takes longer to export.
Static quantisation outputs to ./output/static/ and includes an external data file (encoder-model.int8.onnx.data) due to the model size.
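The difference between the two modes comes down to when the activation scale is chosen. The toy example below uses a simplified symmetric int8 scheme in plain Python to illustrate this; it is an assumption-laden sketch, not what export_model.py or ONNX Runtime does internally, and all names and numbers in it are made up for illustration.

```python
def scale_for(values):
    """Symmetric int8 scale: map the largest magnitude onto 127."""
    peak = max(abs(v) for v in values)
    return peak / 127 if peak > 0 else 1.0


def quantise(values, scale):
    """Round to the nearest int8 step and clamp to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def dequantise(codes, scale):
    """Map int8 codes back to approximate float values."""
    return [c * scale for c in codes]


# Static: one scale is fixed ahead of time from calibration samples.
calibration = [[0.1, -0.5, 0.9], [0.3, -1.0, 0.7]]
static_scale = scale_for([v for sample in calibration for v in sample])

# Dynamic: the scale is recomputed from each activation tensor at runtime.
activations = [0.02, -0.04, 2.0]
dynamic_scale = scale_for(activations)

static_codes = quantise(activations, static_scale)
dynamic_codes = quantise(activations, dynamic_scale)

print("static :", dequantise(static_codes, static_scale))
print("dynamic:", dequantise(dynamic_codes, dynamic_scale))
```

In this toy case the fixed scale clips the value 2.0 because it lies outside the calibrated range, while the runtime scale covers it at the cost of coarser steps for the small values. How well static quantisation works in practice depends on how representative the calibration samples are.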
Output Files
- encoder-model.int8.onnx - Encoder (~622MB dynamic, ~42MB + ~581MB external data for static)
- decoder_joint-model.int8.onnx - Decoder + joiner (8.6MB)
- nemo128.onnx - Mel-spectrogram preprocessor (137KB)
- vocab.txt - Token vocabulary
- config.json - Model configuration
- parakeet-tdt-0.6b-v2-int8.tar.gz - Archive of all model files
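After an export, a quick check that the expected files exist can catch a partial run. The sketch below uses the file names from the output list; the helper missing_outputs is hypothetical, and it assumes the static output directory contains the same file set plus the external data file, which may not match the scripts exactly.

```python
from pathlib import Path

# File names taken from the output list; the .tar.gz archive is left out
# because it only repackages the other files.
EXPECTED_FILES = [
    "encoder-model.int8.onnx",
    "decoder_joint-model.int8.onnx",
    "nemo128.onnx",
    "vocab.txt",
    "config.json",
]


def missing_outputs(output_dir, static=False):
    """Return the expected file names that are absent from output_dir."""
    expected = list(EXPECTED_FILES)
    if static:
        # Static quantisation also writes an external data file for the encoder.
        expected.append("encoder-model.int8.onnx.data")
    root = Path(output_dir)
    return [name for name in expected if not (root / name).is_file()]


if __name__ == "__main__":
    print("missing:", missing_outputs("./output"))
```

For a static export, the equivalent call would be missing_outputs("./output/static", static=True).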
Model tree for smcleod/parakeet-tdt-0.6b-v2-int8
Base model
nvidia/parakeet-tdt-0.6b-v2