Audio · MiniMax · vHD · Released 2025-02-19
MiniMax Speech HD
Affordable text-to-speech and voice cloning in 29 languages.
MiniMax Speech HD is MiniMax's high-definition text-to-speech model, released in February 2025. It supports 29 languages and voice cloning from a 30-second sample, running about 40% cheaper than ElevenLabs v3 with comparable quality for most use cases. On Oakgen, a 1-minute narration costs about 5 credits (~$0.02).
Capabilities at a glance
- 29 languages with preserved timbre across languages
- 30-second voice cloning from a clean sample
- 44.1 kHz MP3/WAV output
- 2–5 second latency per request
- ~40% cheaper than ElevenLabs v3
Specs
- Starting price: $0.02 / generation
- Generation time: 2–5 seconds
- Max resolution: 44.1 kHz stereo
- Inputs → outputs: text → audio
How to use MiniMax Speech HD
1. Pick a stock voice or clone your own. Browse 150+ stock voices by language, age, and use case. For a cloned voice, upload a 30-second clean sample.
2. Paste your script. Each request accepts up to 50,000 characters (about 45 minutes of audio). Use punctuation for natural pauses.
3. Generate. A typical 1-minute script finishes in 3–5 seconds. Download the result as MP3 or WAV.
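Scripts longer than the 50,000-character limit have to be sent as several requests. The sketch below shows one way to split a script at sentence boundaries so that pauses stay natural. The function name `split_script` and the hard-wrap fallback for over-long sentences are assumptions made for illustration, not part of Oakgen's tooling.

```python
import re

MAX_CHARS = 50_000  # per-request character limit stated in step 2


def split_script(script: str, limit: int = MAX_CHARS) -> list[str]:
    """Split a script into chunks of at most `limit` characters,
    breaking at sentence boundaries where possible."""
    # Split on whitespace that follows sentence-ending punctuation.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is hard-wrapped.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            # Adding this sentence would overflow, so start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be submitted as its own request and the resulting audio files joined in order.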
API access
curl -X POST https://api.oakgen.ai/v1/generate/speech \
  -H "Authorization: Bearer $OAKGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "minimax-speech-hd",
    "voice": "female_alloy",
    "text": "Welcome to Oakgen."
  }'

Compared to other models
vs. elevenlabs-v3 — cost
MiniMax Speech HD is ~40% cheaper than ElevenLabs v3 with comparable quality for most narration. ElevenLabs v3 is stronger on emotional nuance, breath, and dramatic dialogue, so pick it for audiobooks or cinema.
License & commercial use
Licensed through MiniMax's commercial terms.
Commercial use is permitted on all paid Oakgen plans.
FAQs
How much does MiniMax Speech HD cost on Oakgen?
MiniMax Speech HD starts at $0.02 per generation on Oakgen. Most generations complete in 2–5 seconds. The $19/month Pro plan includes 5,000 credits, covering roughly 961 generations per month.
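The plan figures above can be cross-checked with a little arithmetic. The sketch below uses only the numbers quoted in this answer ($19, 5,000 credits, roughly 961 generations); the function name `plan_breakdown` is hypothetical, and the derived per-generation values are estimates, not published rates.

```python
# Figures quoted in the FAQ answer above.
PRO_PRICE_USD = 19.00
PRO_CREDITS = 5_000
PRO_GENERATIONS = 961  # "roughly", per the FAQ


def plan_breakdown(price_usd: float, credits: int, generations: int) -> dict[str, float]:
    """Derive per-credit and per-generation figures from a plan's totals."""
    return {
        "usd_per_credit": price_usd / credits,
        "credits_per_generation": credits / generations,
        "usd_per_generation": price_usd / generations,
    }


breakdown = plan_breakdown(PRO_PRICE_USD, PRO_CREDITS, PRO_GENERATIONS)
for name, value in breakdown.items():
    print(f"{name}: {value:.4f}")
# usd_per_credit: 0.0038
# credits_per_generation: 5.2029
# usd_per_generation: 0.0198
```

Under these figures a generation works out to about 5.2 credits, or just under $0.02, which is consistent with the "about 5 credits (~$0.02)" quoted for a 1-minute narration.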
Can I use MiniMax Speech HD commercially?
Yes. Commercial use is permitted on all paid Oakgen plans.
What is the maximum output resolution?
MiniMax Speech HD supports up to 44.1 kHz stereo.
Does Oakgen provide API access to MiniMax Speech HD?
Yes. Oakgen's REST API exposes MiniMax Speech HD under the model slug 'minimax-speech-hd'. See the API access section above for an example request.
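The curl request from the API access section could be issued from Python roughly as follows. This is a minimal sketch using only the standard library. The endpoint, headers, model slug, and voice name are taken from the curl example; the function names `build_speech_request` and `synthesize` are hypothetical. The response format is not documented on this page, so the sketch returns the raw response body without parsing it.

```python
import json
import os
import urllib.request

API_URL = "https://api.oakgen.ai/v1/generate/speech"  # from the curl example


def build_speech_request(text, voice="female_alloy", api_key=None):
    """Build the POST request without sending it."""
    # Falls back to the OAKGEN_API_KEY environment variable used by the curl example.
    key = api_key or os.environ["OAKGEN_API_KEY"]
    payload = {"model": "minimax-speech-hd", "voice": voice, "text": text}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize(text, voice="female_alloy", api_key=None, timeout=30.0):
    """Send the request and return the raw response body as bytes.

    Assumption: the caller inspects the body, since its format
    (audio bytes, JSON with a download URL, ...) is not specified here.
    """
    request = build_speech_request(text, voice, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling `synthesize("Welcome to Oakgen.")` with `OAKGEN_API_KEY` set sends the same payload as the curl example.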