TTS Bench — Speed — mac-default

Rig: mac-m4 — Apple M4 (10C) · Apple M4 GPU (MPS) · 16 GB RAM · Darwin 25.5.0
Label: default voice
TTFA = time to first audio (ms; lower is better). RTF = real-time factor (× realtime; higher is better; e.g. 10× means 10 sec of audio generated per 1 sec of compute). Cold = first run after process start; warm = subsequent runs.
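The two metrics defined above can be sketched in a few lines. This is a minimal illustration of the definitions only, not the benchmark's actual harness: `measure_stream`, the `synthesize` callable, and the chunked-output interface are hypothetical names assumed for the example.

```python
import time
from typing import Callable, Iterable, Sequence, Tuple


def real_time_factor(audio_seconds: float, compute_seconds: float) -> float:
    """RTF as defined above: seconds of audio produced per second of compute."""
    if compute_seconds <= 0:
        raise ValueError("compute_seconds must be positive")
    return audio_seconds / compute_seconds


def measure_stream(
    synthesize: Callable[[], Iterable[Sequence[float]]],  # hypothetical: yields audio chunks
    sample_rate: int,
) -> Tuple[float, float]:
    """Return (TTFA in ms, RTF) for one synthesis call."""
    start = time.perf_counter()
    ttfa_ms = None
    total_samples = 0
    for chunk in synthesize():
        # TTFA is the delay until the first non-empty chunk arrives.
        if ttfa_ms is None and len(chunk) > 0:
            ttfa_ms = (time.perf_counter() - start) * 1000.0
        total_samples += len(chunk)
    elapsed = time.perf_counter() - start
    if ttfa_ms is None:
        raise ValueError("synthesize() produced no audio")
    return ttfa_ms, real_time_factor(total_samples / sample_rate, elapsed)
```

Under this sketch, a cold measurement would be the first call after process start and a warm measurement any later call to the same function.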

Speed winners

Fastest predefined-voice: Piper (cpu) — 32.14× warm RTF, 208ms warm TTFA

Fastest cloning-capable: Pocket-TTS (cpu) — 7.82× warm RTF, 44ms warm TTFA
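The winners above appear to be chosen by highest warm RTF within each category. A minimal sketch of that selection follows, assuming this is the rule used. The Piper and Pocket-TTS categories come from the lines above; the `cloning` flags for Kokoro and NeuTTS Nano are assumptions added only to give each category more than one candidate, and `Result` and `fastest` are hypothetical names.

```python
from typing import List, NamedTuple


class Result(NamedTuple):
    model: str
    device: str
    rtf_warm: float  # warm real-time factor, higher is better
    cloning: bool    # True if the model supports voice cloning


results = [
    Result("Piper", "cpu", 32.14, False),
    Result("Kokoro", "mps", 15.28, False),      # ASSUMPTION: predefined-voice model
    Result("Pocket-TTS", "cpu", 7.82, True),
    Result("NeuTTS Nano", "cpu", 3.10, True),   # ASSUMPTION: cloning-capable model
]


def fastest(rows: List[Result], cloning: bool) -> Result:
    """Pick the row with the highest warm RTF in the requested category."""
    candidates = [r for r in rows if r.cloning == cloning]
    return max(candidates, key=lambda r: r.rtf_warm)


print(fastest(results, cloning=False).model)
print(fastest(results, cloning=True).model)
```

Ranking by warm TTFA instead would only require changing the key function and switching `max` to `min`.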

| Model | Device | TTFA cold | TTFA warm | RTF cold | RTF warm | Peak RAM | Peak VRAM | Size |
|---|---|---|---|---|---|---|---|---|
| Piper | cpu | 228ms | 208ms | 27.47× | 32.14× | 561 MB | | ~25MB |
| Kokoro | mps | 920ms | 489ms | 7.58× | 15.28× | 1.26 GB | | 82M |
| Kokoro | cpu | 914ms | 749ms | 7.78× | 10.04× | 2.49 GB | | 82M |
| KittenTTS | cpu | 994ms | 1.01s | 8.10× | 8.08× | 385 MB | | <100M |
| Pocket-TTS | cpu | 89ms | 44ms | 7.50× | 7.82× | 1.69 GB | | 100M |
| Soprano 80M | cpu | 912ms | 906ms | 7.15× | 7.15× | 1.18 GB | | 80M |
| Soprano 80M | mps | 2.48s | 1.30s | 2.76× | 5.16× | 1.64 GB | | 80M |
| Supertonic | cpu | 1.86s | 1.57s | 4.18× | 4.81× | 968 MB | | 99M |
| NeuTTS Nano | cpu | 740ms | 266ms | 2.30× | 3.10× | 5.83 GB | | 748M |
| NeuTTS Nano | mps | 1.45s | 464ms | 1.81× | 2.96× | 1.28 GB | | 748M |
| NeuTTS Air | cpu | 1.38s | 358ms | 1.55× | 2.16× | 6.12 GB | | 748M |
| NeuTTS Air | mps | 2.32s | 583ms | 1.30× | 2.08× | 1.55 GB | | 748M |
| Coqui XTTS-v2 | mps | 17.81s | 7.82s | 1.14× | 1.70× | 1.84 GB | | 750M |
| Chatterbox Turbo | mps | 9.95s | 5.07s | 0.75× | 1.44× | 2.30 GB | | 744M |
| Coqui XTTS-v2 | cpu | 6.01s | 5.48s | 1.41× | 1.43× | 4.61 GB | | 750M |
| Chatterbox Turbo | cpu | 6.53s | 6.45s | 1.07× | 1.14× | 4.80 GB | | 744M |
| VibeVoice Realtime 0.5B | mps | 9.56s | 8.30s | 0.98× | 1.13× | 838 MB | | 0.5B |
| OmniVoice (4/5 ok) | mps | 5.26s | 5.06s | 0.83× | 0.90× | 981 MB | | ~1B |
| OmniVoice | cpu | 13.29s | 11.23s | 0.47× | 0.57× | 3.59 GB | | ~1B |
| Magpie-TTS | cpu | 25.55s | 26.19s | 0.40× | 0.40× | 4.69 GB | | 357M |
| Chatterbox | cpu | 18.77s | 17.08s | 0.33× | 0.37× | 4.83 GB | | 1.2B |
| VibeVoice Realtime 0.5B | cpu | 25.45s | 25.10s | 0.38× | 0.36× | 3.17 GB | | 0.5B |
| Chatterbox | mps | 25.95s | 31.35s | 0.27× | 0.29× | 698 MB | | 1.2B |
| Qwen3-TTS 1.7B | cpu | 24.52s | 25.11s | 0.25× | 0.24× | 3.93 GB | | 1.7B |
| Sesame CSM-1B | cpu | 38.67s | 31.30s | 0.19× | 0.24× | 6.08 GB | | 1B |
| VoxCPM2 2B | cpu | 28.36s | 52.13s | 0.29× | 0.17× | 144 MB | | 2B |
| IndexTTS-2 | cpu | 80.00s | 48.40s | 0.08× | 0.14× | 318 MB | | 1.5B |
| F5-TTS | mps | 45.43s | 45.54s | 0.09× | 0.09× | 1.29 GB | | 330M |
| F5-TTS | cpu | 47.26s | 47.15s | 0.09× | 0.09× | 3.31 GB | | 330M |
| LuxTTS | cpu | LuxTTS install failed (piper-phonemize has no Windows wheels) | | | | | | |
| LuxTTS | mps | LuxTTS install failed (piper-phonemize has no Windows wheels) | | | | | | |
| ZipVoice 123M | cpu | Reference wav missing for voice cloning | | | | | | |
| ZipVoice 123M | mps | Reference wav missing for voice cloning | | | | | | |
| Mars5-TTS | cpu | Timed out after 10 min: model too slow at this prompt length | | | | | | |
| VibeVoice 1.5B | cpu | Reference wav missing for voice cloning | | | | | | |
| VibeVoice 1.5B | mps | Skipped: out of GPU memory (model exceeds this device's memory) | | | | | | |
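
The TTFA columns mix millisecond and second values (for example `208ms` and `1.01s`), so sorting them as text gives the wrong order. The sketch below shows one way to normalize the cells before sorting, assuming the cell formats seen in the table; `parse_duration_ms` and `parse_rtf` are hypothetical helper names, and the three rows are copied from the table.

```python
import re

# Matches cells such as "208ms" or "1.01s".
_DURATION = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*(ms|s)\s*$")


def parse_duration_ms(text: str) -> float:
    """Convert a TTFA cell to milliseconds."""
    match = _DURATION.match(text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = float(match.group(1)), match.group(2)
    return value if unit == "ms" else value * 1000.0


def parse_rtf(text: str) -> float:
    """Convert an RTF cell such as "32.14×" to a float."""
    return float(text.strip().rstrip("×x"))


# (model, device, TTFA warm, RTF warm) copied from the table above.
rows = [
    ("Piper", "cpu", "208ms", "32.14×"),
    ("KittenTTS", "cpu", "1.01s", "8.08×"),
    ("Pocket-TTS", "cpu", "44ms", "7.82×"),
]

by_ttfa = sorted(rows, key=lambda r: parse_duration_ms(r[2]))           # lower is better
by_rtf = sorted(rows, key=lambda r: parse_rtf(r[3]), reverse=True)      # higher is better

print([r[0] for r in by_ttfa])  # ['Pocket-TTS', 'Piper', 'KittenTTS']
print([r[0] for r in by_rtf])   # ['Piper', 'KittenTTS', 'Pocket-TTS']
```

Rows that carry a status message instead of measurements would raise `ValueError` here and need to be filtered out first.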