5 prompts, one section per prompt. Each section lists every model and device combination, ranked by warm TTFA (fastest first).
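The TTFA values in the tables mix units (for example `47ms` and `1.20s`), so reproducing the ranking from the raw strings means normalizing them to one unit first. The snippet below is a minimal sketch of that step, not the code used to build this page; the helper names `ttfa_seconds` and `rank_by_ttfa` are hypothetical, and the sample rows are copied from Prompt 1.

```python
import re

# Hypothetical helper: convert a TTFA string such as "47ms" or "1.20s" to seconds.
def ttfa_seconds(value: str) -> float:
    match = re.fullmatch(r"\s*([0-9]+(?:\.[0-9]+)?)\s*(ms|s)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized TTFA value: {value!r}")
    number, unit = match.groups()
    return float(number) / 1000.0 if unit == "ms" else float(number)

# Hypothetical helper: rank (model, device, ttfa) rows, fastest first.
# sorted() is stable, so rows with equal TTFA keep their input order.
def rank_by_ttfa(rows):
    ordered = sorted(rows, key=lambda row: ttfa_seconds(row[2]))
    return [(rank, *row) for rank, row in enumerate(ordered, start=1)]

# Sample rows taken from the Prompt 1 results, deliberately out of order.
rows = [
    ("Kokoro", "cpu", "226ms"),
    ("ZipVoice 123M", "cuda", "1.20s"),
    ("Kokoro", "cuda", "47ms"),
    ("Piper", "cpu", "47ms"),
]

ranked = rank_by_ttfa(rows)
for rank, model, device, ttfa in ranked:
    print(rank, model, device, ttfa)
```

Ties such as the two 47ms entries are left in input order here; a real ranking would need an explicit tie-break rule if one matters.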
Prompt 1
[en]"Open the browser and read my email."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Kokoro | cuda | 47ms |
| 2 | Piper | cpu | 47ms |
| 3 | Pocket-TTS | cpu | 66ms |
| 4 | Kokoro | cpu | 226ms |
| 5 | NeuTTS Nano | cuda | 270ms |
| 6 | NeuTTS Nano | cpu | 303ms |
| 7 | NeuTTS Air | cuda | 411ms |
| 8 | Supertonic | cpu | 438ms |
| 9 | NeuTTS Air | cpu | 463ms |
| 10 | KittenTTS | cpu | 504ms |
| 11 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 557ms |
| 12 | Soprano 80M | cuda | 557ms |
| 13 | Coqui XTTS-v2 | cuda | 614ms |
| 14 | Soprano 80M | cpu | 661ms |
| 15 | Chatterbox Turbo | cuda | 725ms |
| 16 | F5-TTS | cuda | 778ms |
| 17 | OmniVoice | cuda | 887ms |
| 18 | ZipVoice 123M | cuda | 1.20s |
| 19 | Chatterbox | cuda | 1.22s |
| 20 | VibeVoice Realtime 0.5B | cuda | 1.35s |
| 21 | Magpie-TTS | cuda | 1.61s |
| 22 | VibeVoice 1.5B | cuda | 1.85s |
| 23 | IndexTTS-2 | cpu | 2.04s |
| 24 | VoxCPM2 2B | cuda | 2.48s |
| 25 | MOSS-TTS-Nano | cuda | 2.57s |
| 26 | MOSS-TTS | cuda | 2.72s |
| 27 | IndexTTS-2 | cuda | 2.74s |
| 28 | Coqui XTTS-v2 | cpu | 2.75s |
| 29 | Qwen3-TTS 1.7B | cuda | 2.78s |
| 30 | MOSS-TTS-Nano | cpu | 2.99s |
| 31 | Chatterbox Turbo | cpu | 3.62s |
| 32 | Sesame CSM-1B | cuda | 3.73s |
| 33 | VibeVoice Realtime 0.5B | cpu | 4.74s |
| 34 | Dia 1.6B | cuda | 4.75s |
| 35 | Chatterbox | cpu | 5.32s |
| 36 | Magpie-TTS | cpu | 7.12s |
| 37 | OmniVoice | cpu | 7.50s |
| 38 | VoxCPM2 2B | cpu | 8.44s |
| 39 | ZipVoice 123M | cpu | 10.63s |
| 40 | Qwen3-TTS 1.7B | cpu | 11.69s |
| 41 | Mars5-TTS | cpu | 19.20s |
| 42 | Mars5-TTS | cuda | 19.75s |
| 43 | VibeVoice 1.5B | cpu | 20.22s |
| 44 | Sesame CSM-1B | cpu | 20.89s |
| 45 | F5-TTS | cpu | 45.35s |
Prompt 2
[en]"I'll start a new git branch, push the changes, and open a pull request when the tests pass."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Kokoro | cuda | 59ms |
| 2 | Piper | cpu | 89ms |
| 3 | Pocket-TTS | cpu | 94ms |
| 4 | NeuTTS Nano | cuda | 268ms |
| 5 | NeuTTS Nano | cpu | 317ms |
| 6 | Kokoro | cpu | 380ms |
| 7 | NeuTTS Air | cuda | 420ms |
| 8 | NeuTTS Air | cpu | 506ms |
| 9 | Supertonic | cpu | 586ms |
| 10 | F5-TTS | cuda | 735ms |
| 11 | OmniVoice | cuda | 773ms |
| 12 | KittenTTS | cpu | 843ms |
| 13 | Coqui XTTS-v2 | cuda | 1.15s |
| 14 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 1.30s |
| 15 | Chatterbox Turbo | cuda | 1.30s |
| 16 | Soprano 80M | cuda | 1.33s |
| 17 | Soprano 80M | cpu | 1.43s |
| 18 | Chatterbox | cuda | 2.24s |
| 19 | VibeVoice Realtime 0.5B | cuda | 2.64s |
| 20 | Magpie-TTS | cuda | 3.45s |
| 21 | VibeVoice 1.5B | cuda | 3.47s |
| 22 | VoxCPM2 2B | cuda | 3.59s |
| 23 | MOSS-TTS | cuda | 3.94s |
| 24 | IndexTTS-2 | cuda | 4.16s |
| 25 | IndexTTS-2 | cpu | 4.23s |
| 26 | MOSS-TTS-Nano | cpu | 4.47s |
| 27 | MOSS-TTS-Nano | cuda | 5.23s |
| 28 | Qwen3-TTS 1.7B | cuda | 6.47s |
| 29 | Coqui XTTS-v2 | cpu | 6.65s |
| 30 | Chatterbox Turbo | cpu | 8.06s |
| 31 | VibeVoice Realtime 0.5B | cpu | 9.36s |
| 32 | Chatterbox | cpu | 10.32s |
| 33 | Sesame CSM-1B | cuda | 10.43s |
| 34 | VoxCPM2 2B | cpu | 11.56s |
| 35 | OmniVoice | cpu | 13.09s |
| 36 | Magpie-TTS | cpu | 21.02s |
| 37 | Qwen3-TTS 1.7B | cpu | 22.87s |
| 38 | Mars5-TTS | cuda | 25.09s |
| 39 | Mars5-TTS | cpu | 26.10s |
| 40 | VibeVoice 1.5B | cpu | 29.51s |
| 41 | Dia 1.6B | cuda | 51.96s |
| 42 | F5-TTS | cpu | 58.82s |
| 43 | Sesame CSM-1B | cpu | 66.45s |
| 44 | ZipVoice 123M | cuda | 137.13s |
Prompt 3
[en]"The Parakeet TDT zero point six billion parameter model achieves one point six nine percent word error rate on LibriSpeech test-clean, beating Whisper Large V3 at two point seven percent while running at over two thousand times realtime on a single GPU."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Pocket-TTS | cpu | 109ms |
| 2 | Kokoro | cuda | 116ms |
| 3 | Piper | cpu | 239ms |
| 4 | NeuTTS Nano | cuda | 265ms |
| 5 | NeuTTS Nano | cpu | 322ms |
| 6 | NeuTTS Air | cuda | 423ms |
| 7 | NeuTTS Air | cpu | 458ms |
| 8 | OmniVoice | cuda | 714ms |
| 9 | F5-TTS | cuda | 1.03s |
| 10 | Kokoro | cpu | 1.27s |
| 11 | Supertonic | cpu | 1.45s |
| 12 | KittenTTS | cpu | 2.46s |
| 13 | Chatterbox Turbo | cuda | 3.12s |
| 14 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 3.30s |
| 15 | Soprano 80M | cuda | 3.70s |
| 16 | Soprano 80M | cpu | 4.27s |
| 17 | Chatterbox | cuda | 4.68s |
| 18 | Coqui XTTS-v2 | cuda | 4.82s |
| 19 | VibeVoice Realtime 0.5B | cuda | 7.86s |
| 20 | MOSS-TTS | cuda | 8.40s |
| 21 | Magpie-TTS | cuda | 9.64s |
| 22 | MOSS-TTS-Nano | cuda | 10.45s |
| 23 | VibeVoice 1.5B | cuda | 10.63s |
| 24 | VoxCPM2 2B | cuda | 11.06s |
| 25 | IndexTTS-2 | cpu | 11.13s |
| 26 | MOSS-TTS-Nano | cpu | 11.83s |
| 27 | IndexTTS-2 | cuda | 12.58s |
| 28 | Sesame CSM-1B | cuda | 18.23s |
| 29 | Chatterbox Turbo | cpu | 19.26s |
| 30 | Qwen3-TTS 1.7B | cuda | 19.42s |
| 31 | Dia 1.6B | cuda | 22.29s |
| 32 | Coqui XTTS-v2 | cpu | 25.28s |
| 33 | VibeVoice Realtime 0.5B | cpu | 26.34s |
| 34 | Chatterbox | cpu | 28.27s |
| 35 | VoxCPM2 2B | cpu | 32.23s |
| 36 | OmniVoice | cpu | 34.51s |
| 37 | Mars5-TTS | cpu | 50.94s |
| 38 | Mars5-TTS | cuda | 51.19s |
| 39 | Qwen3-TTS 1.7B | cpu | 63.06s |
| 40 | F5-TTS | cpu | 77.80s |
| 41 | Sesame CSM-1B | cpu | 78.74s |
| 42 | VibeVoice 1.5B | cpu | 85.74s |
| 43 | Magpie-TTS | cpu | 87.47s |
Prompt 4
[en]"Run pytest tests slash test underscore voice dot py with verbose flag and capture flag set to no."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Kokoro | cuda | 65ms |
| 2 | Pocket-TTS | cpu | 92ms |
| 3 | Piper | cpu | 93ms |
| 4 | NeuTTS Nano | cuda | 258ms |
| 5 | NeuTTS Nano | cpu | 297ms |
| 6 | NeuTTS Air | cuda | 417ms |
| 7 | Kokoro | cpu | 452ms |
| 8 | NeuTTS Air | cpu | 456ms |
| 9 | Supertonic | cpu | 663ms |
| 10 | OmniVoice | cuda | 694ms |
| 11 | F5-TTS | cuda | 836ms |
| 12 | KittenTTS | cpu | 1.01s |
| 13 | Chatterbox Turbo | cuda | 1.33s |
| 14 | Soprano 80M | cuda | 1.49s |
| 15 | Soprano 80M | cpu | 1.62s |
| 16 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 1.66s |
| 17 | Coqui XTTS-v2 | cuda | 1.93s |
| 18 | Chatterbox | cuda | 2.31s |
| 19 | VibeVoice Realtime 0.5B | cuda | 3.23s |
| 20 | MOSS-TTS-Nano | cuda | 3.67s |
| 21 | IndexTTS-2 | cpu | 4.29s |
| 22 | Magpie-TTS | cuda | 4.37s |
| 23 | IndexTTS-2 | cuda | 4.64s |
| 24 | MOSS-TTS | cuda | 4.90s |
| 25 | MOSS-TTS-Nano | cpu | 5.00s |
| 26 | VibeVoice 1.5B | cuda | 5.30s |
| 27 | VoxCPM2 2B | cuda | 6.05s |
| 28 | Chatterbox Turbo | cpu | 8.76s |
| 29 | Coqui XTTS-v2 | cpu | 9.48s |
| 30 | Qwen3-TTS 1.7B | cuda | 9.49s |
| 31 | Dia 1.6B | cuda | 12.27s |
| 32 | Chatterbox | cpu | 12.46s |
| 33 | VibeVoice Realtime 0.5B | cpu | 12.96s |
| 34 | OmniVoice | cpu | 13.20s |
| 35 | VoxCPM2 2B | cpu | 13.91s |
| 36 | ZipVoice 123M | cpu | 16.26s |
| 37 | Sesame CSM-1B | cuda | 17.41s |
| 38 | Mars5-TTS | cpu | 26.53s |
| 39 | Mars5-TTS | cuda | 28.21s |
| 40 | Magpie-TTS | cpu | 29.11s |
| 41 | Qwen3-TTS 1.7B | cpu | 30.68s |
| 42 | VibeVoice 1.5B | cpu | 45.79s |
| 43 | F5-TTS | cpu | 58.87s |
| 44 | Sesame CSM-1B | cpu | 64.06s |
| 45 | ZipVoice 123M | cuda | 142.94s |
Prompt 5
[fr]"Bonjour, je m'appelle Cicero et je vais vous aider avec votre code aujourd'hui."