| Model | Device | TTFA cold | TTFA warm | RTF cold | RTF warm | Peak RAM | Peak VRAM | Size |
|---|---|---|---|---|---|---|---|---|
| Kokoro | cuda | 895ms | 67ms | 8.32× | 103.68× | 2.43 GB | 925 MB | 82M |
| MeloTTS | cuda | 1.58s | 111ms | 5.08× | 67.28× | 2.89 GB | 1.16 GB | ~52M |
| Piper | cpu | 160ms | 107ms | 38.94× | 58.83× | 470 MB | - | ~25 MB |
| StyleTTS 2 | cpu | 1.74s | 251ms | 4.93× | 33.76× | 2.61 GB | - | ~148M |
| StyleTTS 2 | cuda | 1.76s | 265ms | 4.80× | 32.15× | 2.60 GB | 1.49 GB | ~148M |
| OpenVoice v2 | cuda | 1.85s | 361ms | 3.65× | 16.93× | 2.65 GB | 1.35 GB | ~100M |
| Kokoro | cpu | 609ms | 532ms | 12.13× | 14.38× | 1.82 GB | - | 82M |
| Supertonic 3 | cpu | 744ms | 741ms | 9.86× | 9.92× | 570 MB | - | 99M |
| OmniVoice | cuda | 1.12s | 757ms | 6.44× | 9.30× | 2.07 GB | 2.16 GB | ~1B |
| MeloTTS | cpu | 1.96s | 876ms | 3.87× | 9.27× | 2.48 GB | - | ~52M |
| KittenTTS Nano 0.1 | cpu | 1.22s | 1.20s | 6.40× | 6.39× | 338 MB | - | <100M |
| F5-TTS v1 | cuda | 1.31s | 845ms | 3.47× | 5.32× | 2.67 GB | 802 MB | 330M |
| Coqui XTTS-v2 | cuda | 2.05s | 1.87s | 3.63× | 4.75× | 2.10 GB | 2.14 GB | 750M |
| Chatterbox Turbo | cuda | 2.39s | 1.62s | 2.80× | 4.28× | 2.44 GB | 3.01 GB | 744M |
| OpenVoice v2 | cpu | 2.83s | 1.48s | 2.10× | 4.10× | 2.78 GB | - | ~100M |
| Pocket-TTS | cpu | 147ms | 123ms | 3.99× | 4.06× | 1.95 GB | - | 100M |
| Soprano 1.1 80M | cuda | 1.74s | 1.77s | 3.77× | 3.76× | 2.12 GB | 326 MB | 80M |
| Qwen3-TTS 1.7B (CUDA-graph) | cuda | 6.51s | 1.60s | 0.90× | 3.76× | 2.48 GB | 4.89 GB | 1.7B |
| Echo-TTS | cuda | 2.83s | 2.15s | 2.63× | 3.44× | 1.94 GB | 9.38 GB | 2.8B |
| Soprano 1.1 80M | cpu | 1.97s | 2.00s | 3.40× | 3.40× | 1.34 GB | - | 80M |
| NeuTTS Nano | cuda | 678ms | 258ms | 2.19× | 2.76× | 3.24 GB | 3.26 GB | 229M |
| DramaBox | cuda | 5.14s | 3.78s | 1.93× | 2.58× | 2.36 GB | 17.39 GB | 3.3B |
| VibeVoice Realtime 0.5B | cuda | 3.80s | 3.77s | 2.24× | 2.39× | 1.88 GB | 2.62 GB | 0.5B |
| Chatterbox | cuda | 3.35s | 2.61s | 1.66× | 2.24× | 2.80 GB | 3.24 GB | 1.2B |
| NeuTTS Nano | cpu | 698ms | 303ms | 1.73× | 2.00× | 5.03 GB | - | 229M |
| Magpie-TTS | cuda | 5.48s | 4.48s | 1.53× | 1.93× | 3.54 GB | 5.60 GB | 357M |
| NeuTTS Air | cuda | 1.15s | 417ms | 1.34× | 1.62× | 3.60 GB | 3.26 GB | 748M |
| MOSS-TTS-Nano | cuda | 5.83s | 4.92s | 1.36× | 1.61× | 2.42 GB | 777 MB | 100M |
| VibeVoice 1.5B | cuda | 4.99s | 5.32s | 1.44× | 1.61× | 2.04 GB | 5.26 GB | 1.5B |
| VibeVoice 7B | cuda | 5.67s | 6.17s | 1.39× | 1.55× | 2.05 GB | 17.63 GB | 7B |
| MOSS-TTS v1.0 | cuda | 5.23s | 4.74s | 1.34× | 1.48× | 2.08 GB | 22.83 GB | 8B |
| VoxCPM2 2B | cuda | 5.36s | 5.19s | 1.35× | 1.35× | 6.18 GB | 5.65 GB | 2B |
| IndexTTS-2 | cpu | 6.48s | 5.42s | 1.09× | 1.31× | 5.56 GB | - | 1.5B |
| NeuTTS Air | cpu | 1.20s | 471ms | 1.16× | 1.29× | 5.37 GB | - | 748M |
| MOSS-TTS-Nano | cpu | 6.38s | 5.64s | 1.08× | 1.21× | 3.14 GB | - | 100M |
| IndexTTS-2 | cuda | 7.20s | 6.03s | 0.93× | 1.11× | 5.88 GB | 7.60 GB | 1.5B |
| Parler-TTS Mini v1 | cuda | 8.29s | 8.14s | 0.96× | 1.05× | 3.19 GB | 2.63 GB | 878M |
| Fish Speech 1.5 | cuda | 8.53s | 7.84s | 0.81× | 0.94× | 3.83 GB | 1.80 GB | ~500M |
| Coqui XTTS-v2 | cpu | 9.92s | 9.73s | 0.87× | 0.88× | 3.23 GB | - | 750M |
| Chatterbox Turbo | cpu | 10.65s | 9.92s | 0.70× | 0.73× | 4.16 GB | - | 744M |
| Zonos v0.1 | cuda | 10.88s | 10.38s | 0.66× | 0.71× | 6.51 GB | 4.48 GB | 1.6B |
| Qwen3-TTS 1.7B Base | cuda | 10.77s | 8.95s | 0.55× | 0.70× | 2.42 GB | 4.64 GB | 1.7B |
| VibeVoice Realtime 0.5B | cpu | 14.49s | 13.35s | 0.63× | 0.66× | 5.89 GB | - | 0.5B |
| ZipVoice 123M (4/5 ok) | cuda | 68.83s | 86.51s | 0.35× | 0.60× | 25.87 GB | 53.16 GB | 123M |
| Maya1 | cuda | 16.57s | 14.80s | 0.53× | 0.59× | 2.42 GB | 6.72 GB | 3B |
| Dia 1.6B-0626 | cuda | 25.42s | 22.82s | 0.46× | 0.55× | 4.42 GB | 6.32 GB | 1.6B |
| Sesame CSM-1B | cuda | 12.01s | 12.45s | 0.52× | 0.54× | 2.38 GB | 3.51 GB | 1B |
| ZipVoice 123M (3/5 ok) | cpu | 22.96s | 13.44s | 0.27× | 0.45× | 35.45 GB | - | 123M |
| VoxCPM2 2B | cpu | 15.35s | 14.85s | 0.46× | 0.45× | 10.14 GB | - | 2B |
| Chatterbox | cpu | 14.47s | 14.10s | 0.40× | 0.43× | 4.24 GB | - | 1.2B |
| OmniVoice | cpu | 16.45s | 15.98s | 0.38× | 0.39× | 3.05 GB | - | ~1B |
| OuteTTS 1.0 1B | cuda | 25.01s | 24.27s | 0.32× | 0.33× | 2.59 GB | 3.67 GB | 1B |
| Magpie-TTS | cpu | 37.17s | 33.72s | 0.29× | 0.30× | 6.10 GB | - | 357M |
| Mars5-TTS | cpu | 31.19s | 30.69s | 0.22× | 0.24× | 4.03 GB | - | 1.2B |
| Mars5-TTS | cuda | 31.20s | 31.06s | 0.23× | 0.23× | 2.25 GB | 6.81 GB | 1.2B |
| VibeVoice 1.5B | cpu | 39.59s | 45.31s | 0.19× | 0.20× | 11.62 GB | - | 1.5B |
| Qwen3-TTS 1.7B Base | cpu | 34.81s | 30.07s | 0.18× | 0.19× | 10.40 GB | - | 1.7B |
| Fish Speech 1.5 | cpu | 45.73s | 45.34s | 0.17× | 0.17× | 4.46 GB | - | ~500M |
| Parler-TTS Mini v1 | cpu | 63.39s | 62.77s | 0.14× | 0.14× | 4.31 GB | - | 878M |
| Zonos v0.1 | cpu | 62.12s | 60.57s | 0.12× | 0.12× | 7.39 GB | - | 1.6B |
| Sesame CSM-1B | cpu | 50.90s | 57.54s | 0.11× | 0.12× | 5.77 GB | - | 1B |
| F5-TTS v1 | cpu | 58.77s | 60.21s | 0.07× | 0.07× | 2.59 GB | - | 330M |
| Maya1 (2/4 ok) | cpu | 66.31s | 73.66s | 0.07× | 0.07× | 7.33 GB | - | 3B |
| LuxTTS | cpu | LuxTTS install failed (piper-phonemize has no Windows wheels) | ||||||