🗣️ TinyTTS
Ultra-lightweight English Text-to-Speech — only 1.6M parameters, ~3.4 MB ONNX
This Space runs efficiently on CPU and synthesizes high-quality 44.1 kHz audio at roughly 53× real-time speed.
⚡ Comparison with Other TTS Engines
All numbers were measured on CPU only, on the same Intel Core laptop. Input text: "The weather is nice today, and I feel very relaxed."
| Engine | Params | Synthesis time (s) | Audio duration (s) | RTFx |
|---|---|---|---|---|
| TinyTTS (ONNX) 🚀 | 1.6M | 0.092 | 4.88 | ~53x |
| Piper (ONNX, 22kHz) | ~63M | 0.112 | 2.91 | ~26x |
| TinyTTS (PyTorch) | 1.6M | 0.272 | 4.88 | ~18x |
| KittenTTS nano | ~10M | 0.286 | 4.87 | ~17x |
| Supertonic (2-step) | ~82M | 0.249 | 3.69 | ~15x |
| Pocket-TTS | 100M | 0.928 | 3.68 | ~4x |
| Kokoro ONNX | 82M | 0.933 | 3.16 | ~3x |
| KittenTTS mini | ~25M | 2.047 | 4.17 | ~2x |
RTFx = Audio Duration ÷ Synthesis Time (higher = faster). In this comparison TinyTTS (ONNX) is both the smallest and the fastest engine, giving it the best speed-to-size ratio: only 1.6M params / 3.4 MB ONNX, yet ~53× real-time at 44.1 kHz.
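
As a quick check of the RTFx definition, the sketch below recomputes the factor from the two timing columns of the table. The `RESULTS` list and the `rtfx` helper are illustrative names introduced here and are not part of TinyTTS; the numbers are copied from the table, not re-measured.

```python
# Timing values copied from the comparison table:
# (engine, synthesis time in seconds, audio duration in seconds).
RESULTS = [
    ("TinyTTS (ONNX)", 0.092, 4.88),
    ("Piper (ONNX, 22kHz)", 0.112, 2.91),
    ("TinyTTS (PyTorch)", 0.272, 4.88),
    ("KittenTTS nano", 0.286, 4.87),
    ("Supertonic (2-step)", 0.249, 3.69),
    ("Pocket-TTS", 0.928, 3.68),
    ("Kokoro ONNX", 0.933, 3.16),
    ("KittenTTS mini", 2.047, 4.17),
]


def rtfx(audio_s: float, synth_s: float) -> float:
    """Real-time factor: seconds of audio produced per second of compute."""
    if synth_s <= 0:
        raise ValueError("synthesis time must be positive")
    return audio_s / synth_s


if __name__ == "__main__":
    for name, synth_s, audio_s in RESULTS:
        print(f"{name:<22} {rtfx(audio_s, synth_s):6.1f}x")
```

To benchmark another engine the same way, time a single synthesis call (for example with `time.perf_counter()`), divide the number of output samples by the sample rate to get the audio duration, and pass both values to `rtfx`.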