I converted and quantised this model from NVIDIA Parakeet TDT 0.6B V2 (En), as I found https://huggingface.co/istupakov/parakeet-tdt-0.6b-v2-onnx produced errors in its output text, such as unintended word concatenation.


Reconverting the Model

If you need to reconvert from source, I've created a few helper scripts and included them in conversion_scripts/.

Requirements

  • Python 3.12 (3.13+ has compatibility issues with NeMo dependencies)
  • ~4GB disk space for the source .nemo file (downloaded automatically)
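Because the NeMo toolchain is sensitive to the interpreter version, a quick guard at the top of a script lets it fail fast with a clear message. A minimal sketch (the version bound simply reflects the note above; the function name is illustrative):

```python
import sys

def python_version_ok(info=None):
    """Return True when running on Python 3.12.x.

    NeMo's dependencies break on 3.13+, and these conversion
    scripts were only tested on 3.12.
    """
    major, minor = (info or sys.version_info)[:2]
    return (major, minor) == (3, 12)

if __name__ == "__main__":
    if not python_version_ok():
        sys.exit("Please run the conversion scripts with Python 3.12.")
```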

Setup

cd conversion_scripts
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

PyTorch 2.5 is required as newer versions have breaking changes in the ONNX export API.

Export

The scripts automatically download the source model if not present. Output files go to ./output/.

# Build the mel-spectrogram preprocessor
python build_preprocessor.py

# Export and quantise to int8 (dynamic quantisation - fast)
python export_model.py

# Or use static quantisation with calibration data (better quality)
python export_model.py --quantise=static

# Or export fp32 only (no quantisation)
python export_model.py --no-quantise

Dynamic vs Static Quantisation

Dynamic quantisation (default): Fast, no calibration data needed. Weights are quantised at export time, activations are quantised at runtime. Good balance of speed and quality.

Static quantisation (--quantise=static): Uses 200 LibriSpeech samples to calibrate activation ranges. Both weights and activations are quantised with fixed scales. May provide better accuracy for some use cases but takes longer to export.

Static quantisation outputs to ./output/static/ and includes an external data file (encoder-model.int8.onnx.data) due to the model size.

Output Files

  • encoder-model.int8.onnx - Encoder (~622MB dynamic, ~42MB + ~581MB external data for static)
  • decoder_joint-model.int8.onnx - Decoder + joiner (8.6MB)
  • nemo128.onnx - Mel-spectrogram preprocessor (137KB)
  • vocab.txt - Token vocabulary
  • config.json - Model configuration
  • parakeet-tdt-0.6b-v2-int8.tar.gz - Archive of all model files
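As a quick sanity check of the exported artefacts: vocab.txt is a plain one-token-per-line file where the line index is the token id emitted by the decoder. A minimal loader sketch (the SentencePiece-style "▁" handling in the usage comment is an assumption about the tokenizer, shown for illustration):

```python
def load_vocab(path: str) -> list[str]:
    """Read vocab.txt: one token per line, index = token id."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

# Illustrative usage:
# tokens = load_vocab("output/vocab.txt")
# text = "".join(tokens[i] for i in predicted_ids).replace("▁", " ").strip()
```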