TTS Bench — Samples — windows-default

Rig: windows-5090 — AMD Ryzen 9 9950X3D 16-Core Processor (16C) · NVIDIA GeForce RTX 5090 32GB · 126 GB RAM · Windows 11
Label: default voice
5 prompts · one section per prompt · within each section, every model's rendering is ranked by warm TTFA (time to first audio), fastest first.
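The ranking mixes millisecond and second values (for example 47ms and 1.20s), so any re-sorting of these rows has to normalize units first. The following is a minimal sketch of that step, assuming each row has the shape "rank model device ttfa" as in the tables below; the helper names `parse_ttfa`, `parse_row`, and `rank_by_warm_ttfa` are hypothetical and not part of the benchmark's own tooling.

```python
import re

def parse_ttfa(text: str) -> float:
    """Convert a TTFA string such as '47ms' or '1.20s' to seconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized TTFA value: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value / 1000.0 if unit == "ms" else value

def parse_row(line: str) -> tuple[str, str, float]:
    """Split one table row into (model, device, ttfa_seconds).

    Assumes the first token is the rank and the last two tokens are the
    device and the TTFA value; everything in between is the model name,
    which may itself contain spaces.
    """
    parts = line.split()
    if len(parts) < 4:
        raise ValueError(f"row too short: {line!r}")
    model = " ".join(parts[1:-2])
    return model, parts[-2], parse_ttfa(parts[-1])

def rank_by_warm_ttfa(lines: list[str]) -> list[tuple[str, str, float]]:
    """Return rows sorted fastest first; ties keep their input order."""
    return sorted((parse_row(line) for line in lines), key=lambda row: row[2])

if __name__ == "__main__":
    # A few rows copied from Prompt 1, deliberately out of order.
    sample = [
        "18 ZipVoice 123M cuda 1.20s",
        "4 Kokoro cpu 226ms",
        "11 Qwen3-TTS 1.7B (CUDA-graph) cuda 557ms",
        "1 Kokoro cuda 47ms",
    ]
    for rank, (model, device, seconds) in enumerate(rank_by_warm_ttfa(sample), 1):
        print(f"{rank} {model} {device} {seconds * 1000:.0f}ms")
```

Because Python's `sorted` is stable, models with identical TTFA values (such as the two 47ms entries in Prompt 1) stay in their original relative order rather than being reshuffled.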

Prompt 1

[en] "Open the browser and read my email."
Rank Model Device TTFA (warm)
1 Kokoro cuda 47ms
2 Piper cpu 47ms
3 Pocket-TTS cpu 66ms
4 Kokoro cpu 226ms
5 NeuTTS Nano cuda 270ms
6 NeuTTS Nano cpu 303ms
7 NeuTTS Air cuda 411ms
8 Supertonic cpu 438ms
9 NeuTTS Air cpu 463ms
10 KittenTTS cpu 504ms
11 Qwen3-TTS 1.7B (CUDA-graph) cuda 557ms
12 Soprano 80M cuda 557ms
13 Coqui XTTS-v2 cuda 614ms
14 Soprano 80M cpu 661ms
15 Chatterbox Turbo cuda 725ms
16 F5-TTS cuda 778ms
17 OmniVoice cuda 887ms
18 ZipVoice 123M cuda 1.20s
19 Chatterbox cuda 1.22s
20 VibeVoice Realtime 0.5B cuda 1.35s
21 Magpie-TTS cuda 1.61s
22 VibeVoice 1.5B cuda 1.85s
23 IndexTTS-2 cpu 2.04s
24 VoxCPM2 2B cuda 2.48s
25 MOSS-TTS-Nano cuda 2.57s
26 MOSS-TTS cuda 2.72s
27 IndexTTS-2 cuda 2.74s
28 Coqui XTTS-v2 cpu 2.75s
29 Qwen3-TTS 1.7B cuda 2.78s
30 MOSS-TTS-Nano cpu 2.99s
31 Chatterbox Turbo cpu 3.62s
32 Sesame CSM-1B cuda 3.73s
33 VibeVoice Realtime 0.5B cpu 4.74s
34 Dia 1.6B cuda 4.75s
35 Chatterbox cpu 5.32s
36 Magpie-TTS cpu 7.12s
37 OmniVoice cpu 7.50s
38 VoxCPM2 2B cpu 8.44s
39 ZipVoice 123M cpu 10.63s
40 Qwen3-TTS 1.7B cpu 11.69s
41 Mars5-TTS cpu 19.20s
42 Mars5-TTS cuda 19.75s
43 VibeVoice 1.5B cpu 20.22s
44 Sesame CSM-1B cpu 20.89s
45 F5-TTS cpu 45.35s

Prompt 2

[en] "I'll start a new git branch, push the changes, and open a pull request when the tests pass."
Rank Model Device TTFA (warm)
1 Kokoro cuda 59ms
2 Piper cpu 89ms
3 Pocket-TTS cpu 94ms
4 NeuTTS Nano cuda 268ms
5 NeuTTS Nano cpu 317ms
6 Kokoro cpu 380ms
7 NeuTTS Air cuda 420ms
8 NeuTTS Air cpu 506ms
9 Supertonic cpu 586ms
10 F5-TTS cuda 735ms
11 OmniVoice cuda 773ms
12 KittenTTS cpu 843ms
13 Coqui XTTS-v2 cuda 1.15s
14 Qwen3-TTS 1.7B (CUDA-graph) cuda 1.30s
15 Chatterbox Turbo cuda 1.30s
16 Soprano 80M cuda 1.33s
17 Soprano 80M cpu 1.43s
18 Chatterbox cuda 2.24s
19 VibeVoice Realtime 0.5B cuda 2.64s
20 Magpie-TTS cuda 3.45s
21 VibeVoice 1.5B cuda 3.47s
22 VoxCPM2 2B cuda 3.59s
23 MOSS-TTS cuda 3.94s
24 IndexTTS-2 cuda 4.16s
25 IndexTTS-2 cpu 4.23s
26 MOSS-TTS-Nano cpu 4.47s
27 MOSS-TTS-Nano cuda 5.23s
28 Qwen3-TTS 1.7B cuda 6.47s
29 Coqui XTTS-v2 cpu 6.65s
30 Chatterbox Turbo cpu 8.06s
31 VibeVoice Realtime 0.5B cpu 9.36s
32 Chatterbox cpu 10.32s
33 Sesame CSM-1B cuda 10.43s
34 VoxCPM2 2B cpu 11.56s
35 OmniVoice cpu 13.09s
36 Magpie-TTS cpu 21.02s
37 Qwen3-TTS 1.7B cpu 22.87s
38 Mars5-TTS cuda 25.09s
39 Mars5-TTS cpu 26.10s
40 VibeVoice 1.5B cpu 29.51s
41 Dia 1.6B cuda 51.96s
42 F5-TTS cpu 58.82s
43 Sesame CSM-1B cpu 66.45s
44 ZipVoice 123M cuda 137.13s

Prompt 3

[en] "The Parakeet TDT zero point six billion parameter model achieves one point six nine percent word error rate on LibriSpeech test-clean, beating Whisper Large V3 at two point seven percent while running at over two thousand times realtime on a single GPU."
Rank Model Device TTFA (warm)
1 Pocket-TTS cpu 109ms
2 Kokoro cuda 116ms
3 Piper cpu 239ms
4 NeuTTS Nano cuda 265ms
5 NeuTTS Nano cpu 322ms
6 NeuTTS Air cuda 423ms
7 NeuTTS Air cpu 458ms
8 OmniVoice cuda 714ms
9 F5-TTS cuda 1.03s
10 Kokoro cpu 1.27s
11 Supertonic cpu 1.45s
12 KittenTTS cpu 2.46s
13 Chatterbox Turbo cuda 3.12s
14 Qwen3-TTS 1.7B (CUDA-graph) cuda 3.30s
15 Soprano 80M cuda 3.70s
16 Soprano 80M cpu 4.27s
17 Chatterbox cuda 4.68s
18 Coqui XTTS-v2 cuda 4.82s
19 VibeVoice Realtime 0.5B cuda 7.86s
20 MOSS-TTS cuda 8.40s
21 Magpie-TTS cuda 9.64s
22 MOSS-TTS-Nano cuda 10.45s
23 VibeVoice 1.5B cuda 10.63s
24 VoxCPM2 2B cuda 11.06s
25 IndexTTS-2 cpu 11.13s
26 MOSS-TTS-Nano cpu 11.83s
27 IndexTTS-2 cuda 12.58s
28 Sesame CSM-1B cuda 18.23s
29 Chatterbox Turbo cpu 19.26s
30 Qwen3-TTS 1.7B cuda 19.42s
31 Dia 1.6B cuda 22.29s
32 Coqui XTTS-v2 cpu 25.28s
33 VibeVoice Realtime 0.5B cpu 26.34s
34 Chatterbox cpu 28.27s
35 VoxCPM2 2B cpu 32.23s
36 OmniVoice cpu 34.51s
37 Mars5-TTS cpu 50.94s
38 Mars5-TTS cuda 51.19s
39 Qwen3-TTS 1.7B cpu 63.06s
40 F5-TTS cpu 77.80s
41 Sesame CSM-1B cpu 78.74s
42 VibeVoice 1.5B cpu 85.74s
43 Magpie-TTS cpu 87.47s

Prompt 4

[en] "Run pytest tests slash test underscore voice dot py with verbose flag and capture flag set to no."
Rank Model Device TTFA (warm)
1 Kokoro cuda 65ms
2 Pocket-TTS cpu 92ms
3 Piper cpu 93ms
4 NeuTTS Nano cuda 258ms
5 NeuTTS Nano cpu 297ms
6 NeuTTS Air cuda 417ms
7 Kokoro cpu 452ms
8 NeuTTS Air cpu 456ms
9 Supertonic cpu 663ms
10 OmniVoice cuda 694ms
11 F5-TTS cuda 836ms
12 KittenTTS cpu 1.01s
13 Chatterbox Turbo cuda 1.33s
14 Soprano 80M cuda 1.49s
15 Soprano 80M cpu 1.62s
16 Qwen3-TTS 1.7B (CUDA-graph) cuda 1.66s
17 Coqui XTTS-v2 cuda 1.93s
18 Chatterbox cuda 2.31s
19 VibeVoice Realtime 0.5B cuda 3.23s
20 MOSS-TTS-Nano cuda 3.67s
21 IndexTTS-2 cpu 4.29s
22 Magpie-TTS cuda 4.37s
23 IndexTTS-2 cuda 4.64s
24 MOSS-TTS cuda 4.90s
25 MOSS-TTS-Nano cpu 5.00s
26 VibeVoice 1.5B cuda 5.30s
27 VoxCPM2 2B cuda 6.05s
28 Chatterbox Turbo cpu 8.76s
29 Coqui XTTS-v2 cpu 9.48s
30 Qwen3-TTS 1.7B cuda 9.49s
31 Dia 1.6B cuda 12.27s
32 Chatterbox cpu 12.46s
33 VibeVoice Realtime 0.5B cpu 12.96s
34 OmniVoice cpu 13.20s
35 VoxCPM2 2B cpu 13.91s
36 ZipVoice 123M cpu 16.26s
37 Sesame CSM-1B cuda 17.41s
38 Mars5-TTS cpu 26.53s
39 Mars5-TTS cuda 28.21s
40 Magpie-TTS cpu 29.11s
41 Qwen3-TTS 1.7B cpu 30.68s
42 VibeVoice 1.5B cpu 45.79s
43 F5-TTS cpu 58.87s
44 Sesame CSM-1B cpu 64.06s
45 ZipVoice 123M cuda 142.94s

Prompt 5

[fr] "Bonjour, je m'appelle Cicero et je vais vous aider avec votre code aujourd'hui."
Rank Model Device TTFA (warm)
1 Kokoro cuda 50ms
2 Piper cpu 68ms
3 NeuTTS Nano cuda 230ms
4 Pocket-TTS cpu 253ms
5 NeuTTS Nano cpu 277ms
6 Kokoro cpu 334ms
7 Supertonic cpu 565ms
8 OmniVoice cuda 719ms
9 Coqui XTTS-v2 cuda 859ms
10 Qwen3-TTS 1.7B (CUDA-graph) cuda 1.18s
11 MOSS-TTS-Nano cuda 2.67s
12 VoxCPM2 2B cuda 2.77s
13 Magpie-TTS cuda 3.34s
14 MOSS-TTS cuda 3.74s
15 MOSS-TTS-Nano cpu 3.89s
16 Coqui XTTS-v2 cpu 4.47s
17 Qwen3-TTS 1.7B cuda 6.57s
18 VoxCPM2 2B cpu 8.08s
19 OmniVoice cpu 11.60s
20 ZipVoice 123M cpu 13.44s
21 Qwen3-TTS 1.7B cpu 22.03s
22 Magpie-TTS cpu 23.89s
23 ZipVoice 123M cuda 64.78s