I converted and quantised this model from NVIDIA Parakeet TDT 0.6B V2 (En), as I found https://huggingface.co/istupakov/parakeet-tdt-0.6b-v2-onnx produced errors in its output text, such as unintended word concatenation.


Reconverting the Model

If you need to reconvert from source, I've created a few helper scripts and included them in conversion_scripts/.

Requirements

  • Python 3.12 (3.13+ has compatibility issues with NeMo dependencies)
  • ~4GB disk space for the source .nemo file (downloaded automatically)
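Because the NeMo toolchain is sensitive to the interpreter version, a quick guard at the top of a script lets it fail fast with a clear message. A minimal sketch (the version bound simply reflects the note above; the function name is illustrative):

```python
import sys

def python_version_ok(info=None):
    """Return True when running on Python 3.12.x.

    NeMo's dependencies break on 3.13+, and these conversion
    scripts were only tested on 3.12.
    """
    major, minor = (info or sys.version_info)[:2]
    return (major, minor) == (3, 12)

if __name__ == "__main__":
    if not python_version_ok():
        sys.exit("Please run the conversion scripts with Python 3.12.")
```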

Setup

cd conversion_scripts
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt

PyTorch 2.5 is required as newer versions have breaking changes in the ONNX export API.

Export

The scripts automatically download the source model if not present. Output files go to ./output/.

# Build the mel-spectrogram preprocessor
python build_preprocessor.py

# Export and quantise to int8 (dynamic quantisation - fast)
python export_model.py

# Or use static quantisation with calibration data (better quality)
python export_model.py --quantise=static

# Or export fp32 only (no quantisation)
python export_model.py --no-quantise

Dynamic vs Static Quantisation

Dynamic quantisation (default): Fast, no calibration data needed. Weights are quantised at export time, activations are quantised at runtime. Good balance of speed and quality.

Static quantisation (--quantise=static): Uses 200 LibriSpeech samples to calibrate activation ranges. Both weights and activations are quantised with fixed scales. May provide better accuracy for some use cases but takes longer to export.

Static quantisation outputs to ./output/static/ and includes an external data file (encoder-model.int8.onnx.data) due to the model size.

Output Files

  • encoder-model.int8.onnx - Encoder (~622MB dynamic, ~42MB + ~581MB external data for static)
  • decoder_joint-model.int8.onnx - Decoder + joiner (8.6MB)
  • nemo128.onnx - Mel-spectrogram preprocessor (137KB)
  • vocab.txt - Token vocabulary
  • config.json - Model configuration
  • parakeet-tdt-0.6b-v2-int8.tar.gz - Archive of all model files
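As a quick sanity check of the exported artefacts: vocab.txt is a plain one-token-per-line file where the line index is the token id emitted by the decoder. A minimal loader sketch (the SentencePiece-style "▁" handling in the usage comment is an assumption about the tokenizer, shown for illustration):

```python
def load_vocab(path: str) -> list[str]:
    """Read vocab.txt: one token per line, index = token id."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

# Illustrative usage:
# tokens = load_vocab("output/vocab.txt")
# text = "".join(tokens[i] for i in predicted_ids).replace("▁", " ").strip()
```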