TTS Bench — Speed — linux-cloning

Rig: linux-3090 — AMD Ryzen 9 5900XT 16-Core Processor · NVIDIA GeForce RTX 3090 24GB · 63 GB RAM · Linux 6.8.0-117-generic
Label: cloning · chris_hemsworth_15s — ref chris_hemsworth_15s.wav
TTFA = time to first audio (shown in ms or s; lower is better). RTF = real-time factor (× realtime; higher is better; e.g. 10× means 10 s of audio generated per 1 s of compute). Cold = first run after process start; warm = subsequent runs.
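
The two metrics defined above can be sketched in a few lines of Python. This is a minimal illustration, not the harness used for this benchmark: `measure_run` and `real_time_factor` are hypothetical names, and `synthesize` is assumed to be any callable that yields audio chunks (sequences of samples) as they are produced. Note that RTF here is audio time divided by compute time, the inverse of the ratio some other benchmarks report under the same name.

```python
import time
from typing import Callable, Dict, Iterable, Sequence


def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """RTF as defined in this report: seconds of audio per second of compute."""
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


def measure_run(
    synthesize: Callable[[str], Iterable[Sequence[float]]],  # hypothetical streaming TTS call
    text: str,
    sample_rate: int,
) -> Dict[str, float]:
    """Time one synthesis call and return TTFA (ms) and RTF (x realtime)."""
    start = time.perf_counter()
    ttfa_ms = None
    total_samples = 0
    for chunk in synthesize(text):
        if ttfa_ms is None:
            # The first chunk has arrived: this is the time to first audio.
            ttfa_ms = (time.perf_counter() - start) * 1000.0
        total_samples += len(chunk)
    elapsed = time.perf_counter() - start
    if ttfa_ms is None:
        raise RuntimeError("synthesize produced no audio")
    audio_seconds = total_samples / sample_rate
    return {"ttfa_ms": ttfa_ms, "rtf": real_time_factor(audio_seconds, elapsed)}
```

Under these assumptions, the first `measure_run` call after process start would give the cold figures, and later calls in the same process would give the warm ones.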

Speed winners

Fastest (this cloning run): OmniVoice (cuda) — 4.99× warm RTF, 1.49s warm TTFA

| Model | Device | TTFA cold | TTFA warm | RTF cold | RTF warm | Peak RAM | Peak VRAM | Size |
|---|---|---|---|---|---|---|---|---|
| LuxTTS | cuda | 436ms | 241ms | 18.76× | 34.86× | 2.35 GB | 1.13 GB | |
| OmniVoice | cuda | 2.01s | 1.49s | 3.80× | 4.99× | 2.05 GB | 2.33 GB | ~1B |
| Coqui XTTS-v2 | cuda | 1.90s | 1.53s | 3.63× | 4.77× | 2.26 GB | 2.08 GB | 750M |
| Chatterbox Turbo | cuda | 2.12s | 1.61s | 3.10× | 4.28× | 2.34 GB | 3.28 GB | 744M |
| Pocket-TTS | cpu | 151ms | 151ms | 3.94× | 3.86× | 1.69 GB | | 100M |
| F5-TTS | cuda | 2.49s | 1.84s | 2.33× | 3.17× | 2.61 GB | 882 MB | 330M |
| Qwen3-TTS 1.7B (CUDA-graph) | cuda | 51.60s | 19.43s | 2.86× | 3.07× | 2.33 GB | 4.95 GB | 1.7B |
| NeuTTS Nano | cuda | 952ms | 357ms | 1.85× | 2.64× | 3.45 GB | 3.25 GB | 748M |
| LuxTTS | cpu | 2.92s | 2.84s | 2.14× | 2.20× | 2.62 GB | | |
| Chatterbox | cuda | 3.90s | 3.11s | 1.71× | 2.16× | 2.20 GB | 3.66 GB | 1.2B |
| MOSS-TTS-Nano | cuda | 4.61s | 3.22s | 1.54× | 2.14× | 1.91 GB | 870 MB | 100M |
| VoxCPM2 2B | cuda | 4.24s | 3.10s | 1.44× | 2.05× | 3.37 GB | 5.82 GB | 2B |
| NeuTTS Nano | cpu | 1.01s | 419ms | 1.47× | 1.68× | 5.55 GB | | 748M |
| NeuTTS Air | cuda | 1.64s | 478ms | 1.16× | 1.36× | 3.88 GB | 3.25 GB | 748M |
| NeuTTS Air | cpu | 1.68s | 587ms | 1.07× | 1.33× | 5.98 GB | | 748M |
| IndexTTS-2 | cpu | 6.72s | 5.86s | 0.97× | 1.15× | 2.52 GB | | 1.5B |
| MOSS-TTS-Nano | cpu | 6.85s | 6.63s | 1.00× | 1.11× | 1.31 GB | | 100M |
| IndexTTS-2 | cuda | 7.72s | 6.49s | 0.90× | 1.07× | 2.80 GB | 7.70 GB | 1.5B |
| Coqui XTTS-v2 | cpu | 11.08s | 10.05s | 0.77× | 0.79× | 3.09 GB | | 750M |
| Sesame CSM-1B | cuda | 9.14s | 8.23s | 0.62× | 0.74× | 2.24 GB | 3.54 GB | 1B |
| Chatterbox Turbo | cpu | 10.02s | 10.36s | 0.63× | 0.67× | 4.08 GB | | 744M |
| Dia 1.6B | cuda | 12.83s | 9.94s | 0.42× | 0.54× | 2.11 GB | 4.66 GB | 1.6B |
| ZipVoice 123M (4/5 ok) | cpu | 15.97s | 12.22s | 0.41× | 0.52× | 51.51 GB | | 123M |
| Chatterbox | cpu | 21.31s | 20.76s | 0.32× | 0.33× | 4.50 GB | | 1.2B |
| VoxCPM2 2B | cpu | 34.02s | 33.01s | 0.17× | 0.18× | 6.54 GB | | 2B |
| OmniVoice | cpu | 49.48s | 47.44s | 0.15× | 0.16× | 4.12 GB | | ~1B |
| Sesame CSM-1B | cpu | 50.29s | 51.41s | 0.11× | 0.12× | 6.38 GB | | 1B |
| F5-TTS | cpu | 75.56s | 76.09s | 0.08× | 0.08× | 2.63 GB | | 330M |
| Mars5-TTS | cuda | 87.24s | 87.80s | 0.05× | 0.05× | 2.60 GB | 7.24 GB | 1.2B |
| Mars5-TTS | cpu | 93.19s | 85.83s | 0.06× | 0.05× | 2.59 GB | | 1.2B |
| ZipVoice 123M | cuda | Skipped: out of GPU memory (model exceeds this GPU's VRAM) | | | | | | |
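
The TTFA columns mix millisecond and second values (for example 436ms and 2.01s), so comparing or sorting them outside this page needs a common unit first. A minimal sketch of that conversion follows; `to_ms` is a hypothetical helper, and it assumes every value ends in either `ms` or `s` as in the rows above.

```python
def to_ms(value: str) -> float:
    """Convert a duration string such as '436ms' or '2.01s' to milliseconds."""
    text = value.strip()
    if text.endswith("ms"):  # check 'ms' first, since it also ends in 's'
        return float(text[:-2])
    if text.endswith("s"):
        return float(text[:-1]) * 1000.0
    raise ValueError(f"unrecognized duration: {value!r}")


# Sorting a few warm TTFA values taken from the table, fastest first.
warm_ttfa = ["1.49s", "241ms", "1.53s", "151ms"]
fastest_first = sorted(warm_ttfa, key=to_ms)
```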