- Fastest predefined-voice model: Kokoro (cuda), 97.68× warm RTF, 73ms warm TTFA
- Fastest cloning-capable model: OmniVoice (cuda), 8.35× warm RTF, 759ms warm TTFA

| Model | Device | TTFA cold | TTFA warm | RTF cold | RTF warm | Peak RAM | Peak VRAM | Size |
|---|---|---|---|---|---|---|---|---|
| Kokoro | cuda | 491ms | 73ms | 15.15× | 97.68× | 1.85 GB | 879 MB | 82M |
| Piper | cpu | 221ms | 180ms | 28.71× | 37.14× | 477 MB | — | ~25MB |
| LuxTTS | cuda | 391ms | 215ms | 13.18× | 24.98× | 2.39 GB | 987 MB | — |
| OmniVoice | cuda | 1.31s | 759ms | 4.95× | 8.35× | 1.99 GB | 2.11 GB | ~1B |
| Kokoro | cpu | 1.37s | 1.22s | 5.80× | 6.96× | 1.65 GB | — | 82M |
| Supertonic | cpu | 1.28s | 1.28s | 5.87× | 5.89× | 612 MB | — | 99M |
| Coqui XTTS-v2 | cuda | 1.95s | 1.76s | 4.19× | 4.88× | 2.08 GB | 2.12 GB | 750M |
| Soprano 80M | cuda | 1.37s | 1.39s | 4.87× | 4.86× | 1.85 GB | 325 MB | 80M |
| Chatterbox Turbo | cuda | 1.88s | 1.54s | 3.68× | 4.66× | 2.12 GB | 3.01 GB | 744M |
| Pocket-TTS | cpu | 163ms | 148ms | 4.07× | 4.05× | 1.75 GB | — | 100M |
| KittenTTS | cpu | 2.39s | 2.07s | 3.29× | 3.72× | 330 MB | — | <100M |
| Qwen3-TTS 1.7B (CUDA-graph) | cuda | 9.45s | 2.01s | 0.70× | 3.04× | 2.38 GB | 4.89 GB | 1.7B |
| F5-TTS | cuda | 2.01s | 1.43s | 2.22× | 3.03× | 2.27 GB | 802 MB | 330M |
| Soprano 80M | cpu | 2.17s | 2.23s | 3.13× | 3.01× | 1.33 GB | — | 80M |
| NeuTTS Nano | cuda | 859ms | 311ms | 2.26× | 3.00× | 3.45 GB | 3.25 GB | 748M |
| VibeVoice Realtime 0.5B | cuda | 3.60s | 3.28s | 2.45× | 2.76× | 1.78 GB | 2.62 GB | 0.5B |
| Chatterbox | cuda | 3.10s | 2.46s | 1.92× | 2.28× | 2.07 GB | 3.26 GB | 1.2B |
| VoxCPM2 2B | cuda | 3.27s | 3.44s | 2.11× | 2.10× | 3.39 GB | 5.56 GB | 2B |
| MOSS-TTS-Nano | cuda | 4.95s | 3.56s | 1.67× | 2.08× | 1.93 GB | 971 MB | 100M |
| NeuTTS Nano | cpu | 925ms | 389ms | 1.67× | 2.03× | 5.61 GB | — | 748M |
| Magpie-TTS | cuda | 5.77s | 4.40s | 1.57× | 2.02× | 2.68 GB | 6.56 GB | 357M |
| VibeVoice 1.5B | cuda | 4.17s | 4.69s | 1.53× | 1.83× | 1.97 GB | 5.26 GB | 3B |
| LuxTTS | cpu | 1.94s | 1.90s | 1.68× | 1.71× | 2.30 GB | — | — |
| NeuTTS Air | cuda | 1.59s | 478ms | 1.29× | 1.68× | 3.88 GB | 3.25 GB | 748M |
| NeuTTS Air | cpu | 1.60s | 554ms | 1.11× | 1.35× | 6.01 GB | — | 748M |
| IndexTTS-2 | cpu | 6.72s | 5.67s | 0.97× | 1.20× | 5.67 GB | — | 1.5B |
| MOSS-TTS-Nano | cpu | 6.91s | 6.09s | 1.07× | 1.17× | 1.30 GB | — | 100M |
| IndexTTS-2 | cuda | 7.25s | 5.95s | 0.91× | 1.08× | 2.77 GB | 7.57 GB | 1.5B |
| Qwen3-TTS 1.7B | cuda | 8.19s | 7.01s | 0.74× | 0.88× | 2.21 GB | 4.64 GB | 1.7B |
| Coqui XTTS-v2 | cpu | 11.98s | 11.11s | 0.79× | 0.80× | 3.15 GB | — | 750M |
| Sesame CSM-1B | cuda | 9.30s | 8.66s | 0.71× | 0.76× | 2.08 GB | 3.51 GB | 1B |
| Chatterbox Turbo | cpu | 11.03s | 10.58s | 0.66× | 0.69× | 3.87 GB | — | 744M |
| Dia 1.6B | cuda | 16.31s | 13.83s | 0.49× | 0.61× | 2.19 GB | 4.58 GB | 1.6B |
| VibeVoice Realtime 0.5B | cpu | 17.64s | 15.86s | 0.57× | 0.55× | 6.51 GB | — | 0.5B |
| ZipVoice 123M (4/5 ok) | cpu | 15.01s | 12.19s | 0.43× | 0.53× | 53.97 GB | — | 123M |
| OmniVoice | cpu | 15.83s | 15.35s | 0.40× | 0.41× | 2.99 GB | — | ~1B |
| Chatterbox | cpu | 17.21s | 16.53s | 0.36× | 0.37× | 4.16 GB | — | 1.2B |
| Magpie-TTS | cpu | 40.47s | 39.78s | 0.25× | 0.26× | 13.55 GB | — | 357M |
| VoxCPM2 2B | cpu | 29.43s | 29.16s | 0.24× | 0.24× | 6.44 GB | — | 2B |
| VibeVoice 1.5B | cpu | 43.47s | 52.25s | 0.17× | 0.17× | 11.26 GB | — | 3B |
| Qwen3-TTS 1.7B | cpu | 38.52s | 35.33s | 0.15× | 0.17× | 9.12 GB | — | 1.7B |
| Sesame CSM-1B | cpu | 52.98s | 57.59s | 0.12× | 0.12× | 5.95 GB | — | 1B |
| Mars5-TTS | cuda | 60.14s | 58.72s | 0.13× | 0.12× | 2.60 GB | 6.80 GB | 1.2B |
| Mars5-TTS | cpu | 59.06s | 58.01s | 0.12× | 0.12× | 2.64 GB | — | 1.2B |
| F5-TTS | cpu | 53.94s | 55.46s | 0.08× | 0.08× | 2.58 GB | — | 330M |
| ZipVoice 123M | cuda | Skipped: out of GPU memory (model exceeds this GPU's VRAM) | | | | | | |
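
The table does not spell out how TTFA and RTF were measured. The sketch below shows one plausible way to derive both from a single streaming synthesis call. It assumes that TTFA is the wall-clock delay until the first audio chunk arrives and that the × values are a speed factor (seconds of audio produced per second of wall-clock time, so higher is faster). The function name `measure_stream`, its parameters, and the chunk format are hypothetical and are not taken from the benchmark harness.

```python
import time
from typing import Callable, Iterable, Tuple


def measure_stream(
    chunks: Iterable[int],
    sample_rate: int,
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[float, float]:
    """Return (ttfa_seconds, speed_factor) for one synthesis call.

    `chunks` is a hypothetical stand-in for a model's streaming output:
    each item is the number of audio samples in one generated chunk.
    """
    start = clock()
    ttfa = None
    total_samples = 0

    for n_samples in chunks:
        if ttfa is None:
            # Assumed definition: time to first audio is the delay
            # until the first chunk is available.
            ttfa = clock() - start
        total_samples += n_samples

    elapsed = clock() - start
    if ttfa is None or elapsed <= 0:
        raise ValueError("no audio produced or clock did not advance")

    audio_seconds = total_samples / sample_rate
    # Assumed definition: speed factor = audio duration / synthesis time.
    return ttfa, audio_seconds / elapsed
```

Under these assumptions, a run that yields 4 seconds of audio in 2 seconds of wall-clock time would be reported as 2.00×. Cold and warm figures would then come from applying the same measurement to the first call after model load and to a later call, respectively.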