5 prompts · one section per prompt · all models ranked by warm TTFA (fastest first) within each
Each prompt section lists every model's audio output, ordered by warm TTFA (time to first audio on an already warmed-up model), fastest first.
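As a rough illustration of what a warm TTFA number captures, the sketch below times how long a streaming synthesizer takes to yield its first audio chunk, after discarding a warm-up run. The page does not describe its measurement harness, so everything here is an assumption: `synthesize` is a hypothetical callable that yields audio chunks, `fake_synthesize` is a stand-in for a real model, and the warm-up and run counts are arbitrary.

```python
import statistics
import time
from typing import Callable, Iterable


def measure_ttfa(synthesize: Callable[[str], Iterable[bytes]], text: str) -> float:
    """Seconds from the call until the first non-empty audio chunk arrives."""
    start = time.perf_counter()
    for chunk in synthesize(text):
        if chunk:  # ignore empty chunks; stop at the first real audio
            return time.perf_counter() - start
    raise RuntimeError("synthesizer produced no audio")


def warm_ttfa(
    synthesize: Callable[[str], Iterable[bytes]],
    text: str,
    warmup_runs: int = 1,
    timed_runs: int = 3,
) -> float:
    """Discard warm-up runs, then return the median TTFA of the timed runs."""
    for _ in range(warmup_runs):
        for _chunk in synthesize(text):  # drain fully so lazy init and caches settle
            pass
    samples = [measure_ttfa(synthesize, text) for _ in range(timed_runs)]
    return statistics.median(samples)


def fake_synthesize(text: str):
    """Hypothetical stand-in for a model: a short delay, then two audio chunks."""
    time.sleep(0.01)
    yield b"\x00" * 320
    yield b"\x00" * 320


if __name__ == "__main__":
    seconds = warm_ttfa(fake_synthesize, "Open the browser and read my email.")
    print(f"warm TTFA: {seconds * 1000:.0f}ms")
```

For a model that does not stream, the first chunk only arrives once synthesis has finished, so under this definition its TTFA would be the same as its total generation time.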
Reference voice
Each model below was given the reference clip and its transcript as the voice to imitate. Source: chris_hemsworth_15s.wav
Prompt 1
[en]"Open the browser and read my email."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Pocket-TTS | cpu | 63ms |
| 2 | NeuTTS Nano | cuda | 290ms |
| 3 | NeuTTS Nano | cpu | 318ms |
| 4 | NeuTTS Air | cuda | 428ms |
| 5 | NeuTTS Air | cpu | 475ms |
| 6 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 542ms |
| 7 | Coqui XTTS-v2 | cuda | 563ms |
| 8 | Chatterbox Turbo | cuda | 632ms |
| 9 | F5-TTS | cuda | 728ms |
| 10 | OmniVoice | cuda | 736ms |
| 11 | ZipVoice 123M | cuda | 1.08s |
| 12 | Chatterbox | cuda | 1.25s |
| 13 | Dia 1.6B | cuda | 1.30s |
| 14 | VoxCPM2 2B | cuda | 1.81s |
| 15 | IndexTTS-2 | cpu | 2.09s |
| 16 | MOSS-TTS-Nano | cuda | 2.28s |
| 17 | Coqui XTTS-v2 | cpu | 2.53s |
| 18 | IndexTTS-2 | cuda | 2.57s |
| 19 | MOSS-TTS-Nano | cpu | 2.61s |
| 20 | Qwen3-TTS 1.7B | cuda | 2.79s |
| 21 | MOSS-TTS | cuda | 2.99s |
| 22 | Chatterbox Turbo | cpu | 3.48s |
| 23 | Sesame CSM-1B | cuda | 3.97s |
| 24 | VoxCPM2 2B | cpu | 6.39s |
| 25 | Chatterbox | cpu | 6.52s |
| 26 | ZipVoice 123M | cpu | 7.67s |
| 27 | Qwen3-TTS 1.7B | cpu | 11.80s |
| 28 | Sesame CSM-1B | cpu | 17.17s |
| 29 | Mars5-TTS | cuda | 29.79s |
| 30 | Mars5-TTS | cpu | 30.56s |
| 31 | OmniVoice | cpu | 37.89s |
| 32 | F5-TTS | cpu | 50.19s |
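The TTFA values above mix milliseconds and seconds, so comparing or re-sorting them first requires a common unit. The sketch below is one way to do that, assuming every value is a plain number followed by `ms` or `s`, as in these results. The helper name `ttfa_to_ms` is hypothetical, and the three sample rows are copied from the Prompt 1 results.

```python
import re

# Multipliers that convert each unit suffix to milliseconds.
_UNIT_TO_MS = {"ms": 1.0, "s": 1000.0}


def ttfa_to_ms(value: str) -> float:
    """Convert a string such as '63ms' or '1.25s' to milliseconds."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(ms|s)\s*", value)
    if match is None:
        raise ValueError(f"unrecognized TTFA value: {value!r}")
    number, unit = match.groups()
    return float(number) * _UNIT_TO_MS[unit]


# (model, device, TTFA warm) rows taken from the Prompt 1 results, deliberately unsorted.
rows = [
    ("Chatterbox", "cuda", "1.25s"),
    ("Pocket-TTS", "cpu", "63ms"),
    ("F5-TTS", "cuda", "728ms"),
]

# Sort fastest first by the normalized value.
ranked = sorted(rows, key=lambda row: ttfa_to_ms(row[2]))

if __name__ == "__main__":
    for rank, (model, device, ttfa) in enumerate(ranked, start=1):
        print(rank, model, device, ttfa)
```

Sorting on the normalized value rather than the raw string avoids ranking `1.25s` ahead of `63ms`.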
Prompt 2
[en]"I'll start a new git branch, push the changes, and open a pull request when the tests pass."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Pocket-TTS | cpu | 86ms |
| 2 | NeuTTS Nano | cuda | 294ms |
| 3 | NeuTTS Nano | cpu | 333ms |
| 4 | NeuTTS Air | cuda | 456ms |
| 5 | NeuTTS Air | cpu | 480ms |
| 6 | OmniVoice | cuda | 759ms |
| 7 | F5-TTS | cuda | 914ms |
| 8 | Coqui XTTS-v2 | cuda | 1.09s |
| 9 | Chatterbox Turbo | cuda | 1.13s |
| 10 | Chatterbox | cuda | 2.21s |
| 11 | MOSS-TTS-Nano | cuda | 3.38s |
| 12 | VoxCPM2 2B | cuda | 3.43s |
| 13 | MOSS-TTS | cuda | 3.52s |
| 14 | MOSS-TTS-Nano | cpu | 3.97s |
| 15 | IndexTTS-2 | cpu | 4.01s |
| 16 | IndexTTS-2 | cuda | 4.12s |
| 17 | Coqui XTTS-v2 | cpu | 6.01s |
| 18 | Chatterbox Turbo | cpu | 6.28s |
| 19 | Dia 1.6B | cuda | 8.34s |
| 20 | Sesame CSM-1B | cuda | 9.01s |
| 21 | Chatterbox | cpu | 11.99s |
| 22 | ZipVoice 123M | cpu | 13.40s |
| 23 | VoxCPM2 2B | cpu | 14.44s |
| 24 | Mars5-TTS | cuda | 33.89s |
| 25 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 36.45s |
| 26 | Sesame CSM-1B | cpu | 38.15s |
| 27 | Mars5-TTS | cpu | 43.83s |
| 28 | OmniVoice | cpu | 48.65s |
| 29 | F5-TTS | cpu | 64.77s |
| 30 | ZipVoice 123M | cuda | 133.84s |
Prompt 3
[en]"The Parakeet TDT zero point six billion parameter model achieves one point six nine percent word error rate on LibriSpeech test-clean, beating Whisper Large V3 at two point seven percent while running at over two thousand times realtime on a single GPU."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Pocket-TTS | cpu | 105ms |
| 2 | NeuTTS Nano | cuda | 286ms |
| 3 | NeuTTS Nano | cpu | 337ms |
| 4 | NeuTTS Air | cuda | 432ms |
| 5 | NeuTTS Air | cpu | 478ms |
| 6 | OmniVoice | cuda | 934ms |
| 7 | F5-TTS | cuda | 1.71s |
| 8 | Chatterbox Turbo | cuda | 3.27s |
| 9 | Coqui XTTS-v2 | cuda | 3.35s |
| 10 | Chatterbox | cuda | 5.20s |
| 11 | MOSS-TTS | cuda | 8.96s |
| 12 | IndexTTS-2 | cpu | 9.05s |
| 13 | MOSS-TTS-Nano | cuda | 9.17s |
| 14 | VoxCPM2 2B | cuda | 10.29s |
| 15 | MOSS-TTS-Nano | cpu | 10.40s |
| 16 | IndexTTS-2 | cuda | 10.75s |
| 17 | Sesame CSM-1B | cuda | 17.44s |
| 18 | Chatterbox Turbo | cpu | 20.05s |
| 19 | Coqui XTTS-v2 | cpu | 20.33s |
| 20 | Dia 1.6B | cuda | 24.69s |
| 21 | Chatterbox | cpu | 34.74s |
| 22 | VoxCPM2 2B | cpu | 35.11s |
| 23 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 36.41s |
| 24 | Mars5-TTS | cpu | 54.81s |
| 25 | Mars5-TTS | cuda | 55.86s |
| 26 | Sesame CSM-1B | cpu | 77.75s |
| 27 | OmniVoice | cpu | 78.95s |
| 28 | F5-TTS | cpu | 103.89s |
| 29 | ZipVoice 123M | cpu | 183.07s |
Prompt 4
[en]"Run pytest tests slash test underscore voice dot py with verbose flag and capture flag set to no."
| Rank | Model | Device | TTFA warm |
|---|---|---|---|
| 1 | Pocket-TTS | cpu | 95ms |
| 2 | NeuTTS Nano | cuda | 294ms |
| 3 | NeuTTS Nano | cpu | 339ms |
| 4 | NeuTTS Air | cuda | 436ms |
| 5 | NeuTTS Air | cpu | 462ms |
| 6 | OmniVoice | cuda | 752ms |
| 7 | F5-TTS | cuda | 870ms |
| 8 | Chatterbox Turbo | cuda | 1.32s |
| 9 | Coqui XTTS-v2 | cuda | 1.59s |
| 10 | Chatterbox | cuda | 2.87s |
| 11 | MOSS-TTS-Nano | cuda | 3.69s |
| 12 | IndexTTS-2 | cpu | 4.32s |
| 13 | MOSS-TTS-Nano | cpu | 4.58s |
| 14 | MOSS-TTS | cuda | 4.69s |
| 15 | IndexTTS-2 | cuda | 5.19s |
| 16 | VoxCPM2 2B | cuda | 5.19s |
| 17 | Chatterbox Turbo | cpu | 7.65s |
| 18 | Coqui XTTS-v2 | cpu | 9.36s |
| 19 | Dia 1.6B | cuda | 11.47s |
| 20 | VoxCPM2 2B | cpu | 14.38s |
| 21 | Sesame CSM-1B | cuda | 15.42s |
| 22 | Chatterbox | cpu | 16.32s |
| 23 | ZipVoice 123M | cpu | 16.61s |
| 24 | Qwen3-TTS 1.7B (CUDA-graph) | cuda | 36.45s |
| 25 | Mars5-TTS | cuda | 38.59s |
| 26 | Mars5-TTS | cpu | 43.71s |
| 27 | OmniVoice | cpu | 49.71s |
| 28 | Sesame CSM-1B | cpu | 61.47s |
| 29 | F5-TTS | cpu | 65.87s |
| 30 | ZipVoice 123M | cuda | 142.27s |
Prompt 5
[fr]"Bonjour, je m'appelle Cicero et je vais vous aider avec votre code aujourd'hui."