
TTS Bench — Speed — windows-cloning

Rig: windows-5090 — AMD Ryzen 9 9950X3D 16-Core Processor (16C) · NVIDIA GeForce RTX 5090 32GB · 126 GB RAM · Windows 11
Label: cloning · chris_hemsworth_15s — ref chris_hemsworth_15s.wav
TTFA = time to first audio (ms; lower is better). RTF = real-time factor (× realtime; higher is better; e.g. 10× means 10 sec of audio generated per 1 sec of compute). Cold = first run after process start; warm = subsequent runs.
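A minimal sketch of how the two metrics defined above could be computed for a single synthesis call. The function names and the chunk-streaming interface (`stream`) are assumptions for illustration, not the benchmark's actual harness:

```python
import time
from typing import Callable, Iterable, Tuple


def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """RTF as defined in this report: seconds of audio per second of compute."""
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


def measure(stream: Callable[[], Iterable[bytes]], sample_rate: int,
            bytes_per_sample: int = 2) -> Tuple[float, float]:
    """Return (ttfa_ms, rtf) for one synthesis call.

    `stream` is a hypothetical callable that yields mono PCM chunks.
    """
    start = time.perf_counter()
    ttfa_ms = None
    total_bytes = 0
    for chunk in stream():
        if ttfa_ms is None:
            # Time to first audio: delay until the first chunk arrives.
            ttfa_ms = (time.perf_counter() - start) * 1000.0
        total_bytes += len(chunk)
    elapsed = time.perf_counter() - start
    if ttfa_ms is None:
        raise RuntimeError("no audio produced")
    audio_seconds = total_bytes / (bytes_per_sample * sample_rate)
    return ttfa_ms, real_time_factor(audio_seconds, elapsed)
```

Under this definition, a model that yields 10 s of audio in 1 s of wall-clock time scores 10×, matching the example in the legend.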

Speed winners

Fastest (this cloning run): OmniVoice (cuda) — 9.75× warm RTF, 803ms warm TTFA
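The headline above appears to pick the row with the highest warm RTF. A small sketch of that selection, using three rows copied from the table below; the `Row` type and the ranking rule are assumptions for illustration:

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Row:
    model: str
    device: str
    ttfa_warm_ms: float
    rtf_warm: float


# Three rows taken from the results table (warm TTFA converted to ms).
rows: List[Row] = [
    Row("OmniVoice", "cuda", 803.0, 9.75),
    Row("F5-TTS", "cuda", 1060.0, 5.50),
    Row("Pocket-TTS", "cpu", 124.0, 4.10),
]


def fastest(rows: List[Row]) -> Row:
    """Highest warm RTF wins (assumed ranking rule)."""
    return max(rows, key=lambda r: r.rtf_warm)


def lowest_latency(rows: List[Row]) -> Row:
    """Alternative ranking: lowest warm TTFA wins."""
    return min(rows, key=lambda r: r.ttfa_warm_ms)


winner = fastest(rows)
print(f"{winner.model} ({winner.device}): {winner.rtf_warm:.2f}x warm RTF")
# prints "OmniVoice (cuda): 9.75x warm RTF"
```

Ranking by warm TTFA instead would select Pocket-TTS (cpu) at 124 ms, so the choice of metric changes which model counts as fastest.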

| Model | Device | TTFA cold | TTFA warm | RTF cold | RTF warm | Peak RAM | Peak VRAM | Size |
|---|---|---|---|---|---|---|---|---|
| OmniVoice | cuda | 1.20s | 803ms | 6.67× | 9.75× | 2.27 GB | 2.42 GB | ~1B |
| F5-TTS | cuda | 1.56s | 1.06s | 3.75× | 5.50× | 2.67 GB | 883 MB | 330M |
| Coqui XTTS-v2 | cuda | 2.02s | 1.51s | 3.02× | 4.76× | 2.35 GB | 2.10 GB | 750M |
| Chatterbox Turbo | cuda | 2.66s | 1.59s | 2.43× | 4.15× | 2.69 GB | 3.33 GB | 744M |
| Pocket-TTS | cpu | 136ms | 124ms | 4.11× | 4.10× | 2.29 GB | - | 100M |
| Qwen3-TTS 1.7B (CUDA-graph) | cuda | 34.05s | 25.83s | 2.94× | 3.89× | 2.52 GB | 4.95 GB | 1.7B |
| NeuTTS Nano | cuda | 738ms | 305ms | 1.79× | 2.44× | 3.22 GB | 3.26 GB | 748M |
| Chatterbox | cuda | 4.01s | 2.88s | 1.63× | 2.33× | 3.02 GB | 3.63 GB | 1.2B |
| NeuTTS Nano | cpu | 754ms | 338ms | 1.47× | 1.90× | 5.11 GB | - | 748M |
| MOSS-TTS-Nano | cuda | 5.59s | 4.87s | 1.34× | 1.66× | 2.42 GB | 910 MB | 100M |
| NeuTTS Air | cuda | 1.16s | 438ms | 1.20× | 1.64× | 3.57 GB | 3.26 GB | 748M |
| IndexTTS-2 | cpu | 5.77s | 4.87s | 1.20× | 1.47× | 5.54 GB | - | 1.5B |
| MOSS-TTS | cuda | 5.44s | 4.73s | 1.31× | 1.44× | 2.10 GB | 22.83 GB | 8B |
| NeuTTS Air | cpu | 1.22s | 474ms | 1.11× | 1.34× | 5.47 GB | - | 748M |
| VoxCPM2 2B | cuda | 6.36s | 4.72s | 0.96× | 1.33× | 2.49 GB | 5.71 GB | 2B |
| IndexTTS-2 | cuda | 6.80s | 5.66s | 1.03× | 1.26× | 4.05 GB | 7.71 GB | 1.5B |
| MOSS-TTS-Nano | cpu | 5.69s | 5.04s | 1.16× | 1.25× | 3.05 GB | - | 100M |
| Coqui XTTS-v2 | cpu | 9.93s | 8.76s | 0.84× | 0.85× | 3.43 GB | - | 750M |
| Chatterbox Turbo | cpu | 10.84s | 9.37s | 0.68× | 0.71× | 4.39 GB | - | 744M |
| Qwen3-TTS 1.7B (1/5 ok) | cuda | 4.37s | 2.79s | 0.49× | 0.69× | 2.42 GB | 4.45 GB | 1.7B |
| ZipVoice 123M (4/5 ok) | cuda | 68.19s | 105.99s | 0.38× | 0.60× | 25.56 GB | 53.24 GB | 123M |
| Sesame CSM-1B | cuda | 13.21s | 11.46s | 0.47× | 0.55× | 2.47 GB | 3.54 GB | 1B |
| Dia 1.6B | cuda | 13.79s | 11.45s | 0.37× | 0.50× | 4.42 GB | 4.82 GB | 1.6B |
| ZipVoice 123M | cpu | 50.03s | 47.06s | 0.30× | 0.43× | 39.66 GB | - | 123M |
| VoxCPM2 2B | cpu | 18.12s | 16.04s | 0.34× | 0.39× | 7.53 GB | - | 2B |
| Chatterbox | cpu | 17.39s | 17.39s | 0.36× | 0.38× | 4.72 GB | - | 1.2B |
| Qwen3-TTS 1.7B (2/5 ok) | cpu | 18.23s | 18.32s | 0.14× | 0.16× | 9.21 GB | - | 1.7B |
| OmniVoice | cpu | 54.74s | 53.28s | 0.14× | 0.14× | 3.98 GB | - | ~1B |
| Sesame CSM-1B | cpu | 48.90s | 48.64s | 0.12× | 0.12× | 6.32 GB | - | 1B |
| Mars5-TTS | cpu | 41.89s | 43.23s | 0.08× | 0.10× | 2.24 GB | - | 1.2B |
| F5-TTS | cpu | 71.35s | 71.18s | 0.08× | 0.08× | 2.71 GB | - | 330M |
| Mars5-TTS | cuda | 41.12s | 39.53s | 0.09× | 0.07× | 2.23 GB | 7.23 GB | 1.2B |

LuxTTS (cpu): install failed (piper-phonemize has no Windows wheels)