Veo 3
Text-to-video with native synchronized audio — dialogue, ambient sound, and music in one pass.
Veo 3 is Google DeepMind's flagship text-to-video model, released in May 2025. It's the only widely available model that generates video and synchronized audio (dialogue, ambient sound, music cues) in a single pass. On Oakgen, an 8-second 1080p clip costs about 480 credits (~$1.85) and renders in 60–120 seconds.
Capabilities at a glance
- 8-second clips at up to 1920×1080 (1080p)
- Native synchronized audio — dialogue, ambient, music cues
- Cinematic physics and camera motion
- 60–120 second generation on Oakgen
- Veo 3 Fast variant available at a 35% lower credit cost
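The pricing figures above imply an approximate cost for the Fast variant. The sketch below works through that arithmetic under two assumptions that the page does not confirm: that the 35% discount applies directly to the 480-credit figure for an 8-second 1080p clip, and that dollars scale linearly with credits. The function names are illustrative only.

```shell
#!/bin/sh
# Figures taken from the page: about 480 credits (~$1.85) per 8-second 1080p clip.
STANDARD_CREDITS=480
STANDARD_CENTS=185
FAST_DISCOUNT_PCT=35   # Veo 3 Fast: 35% lower credit cost

# Assumption: the discount applies directly to the 480-credit figure.
fast_credits() {
  echo $(( STANDARD_CREDITS * (100 - FAST_DISCOUNT_PCT) / 100 ))
}

# Assumption: dollars scale linearly with credits.
# Prints the approximate cost in whole cents (rounded down) for a credit amount.
credits_to_cents() {
  echo $(( $1 * STANDARD_CENTS / STANDARD_CREDITS ))
}

echo "Veo 3 Fast: $(fast_credits) credits, about $(credits_to_cents "$(fast_credits)") cents"
```

Under these assumptions a Veo 3 Fast clip comes to 312 credits, or roughly $1.20. Actual Oakgen pricing may differ.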
Specs
- Starting price: $1.85 / generation
- Generation time: 60–120 seconds
- Max resolution: 1920×1080
- Inputs → outputs: text, image → video, audio
How to use Veo 3
1. Describe camera, subject, and audio. Veo 3 handles audio cues in the prompt. Example: 'Slow dolly-in on a bustling Tokyo street at night, neon reflections, ambient city sounds with distant traffic'.
2. Set aspect ratio. Use 16:9 for YouTube, 9:16 for Reels/Shorts/TikTok, and 1:1 for feed. All three render at full 1080p.
3. Generate. A typical 8-second clip renders in 60–120 seconds. Queues can be longer during peak hours; Oakgen automatically retries on FAL or WaveSpeed.
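The first two steps can be sketched as small shell helpers that assemble a prompt, pick an aspect ratio, and emit a JSON body. This is a minimal sketch, not Oakgen tooling: the function names and platform labels are hypothetical, and the JSON fields (`model`, `prompt`, `duration`, `aspect_ratio`) are simply the ones shown in the curl example under API access.

```shell
#!/bin/sh
# Hypothetical helpers; platform labels are illustrative, not Oakgen parameters.

# Step 2: map a target platform to the aspect ratio suggested above.
aspect_for() {
  case "$1" in
    youtube)             echo "16:9" ;;
    reels|shorts|tiktok) echo "9:16" ;;
    feed)                echo "1:1" ;;
    *) echo "unknown platform: $1" >&2; return 1 ;;
  esac
}

# Step 1: join camera, subject, and audio descriptions into one prompt.
build_prompt() {
  printf '%s, %s, %s' "$1" "$2" "$3"
}

# Build the JSON body. Only backslashes and double quotes are escaped,
# so prompts containing newlines or control characters are not handled.
build_payload() {
  prompt_json=$(printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g')
  printf '{"model":"veo-3","prompt":"%s","duration":8,"aspect_ratio":"%s"}\n' \
    "$prompt_json" "$2"
}

prompt=$(build_prompt "Slow dolly-in" \
  "a bustling Tokyo street at night" \
  "ambient city sounds with distant traffic")
ratio=$(aspect_for reels)
build_payload "$prompt" "$ratio"
```

The printed JSON could then be supplied as the `-d` body of the request shown under API access. For prompts with arbitrary characters, a real JSON encoder would be a safer choice than the `sed` escaping used here.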
API access
curl -X POST https://api.oakgen.ai/v1/generate/video \
  -H "Authorization: Bearer $OAKGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3",
    "prompt": "Slow dolly-in on a Tokyo street at night, neon reflections",
    "duration": 8,
    "aspect_ratio": "16:9"
  }'
Compared to other models
Veo 3 generates audio in the same pass as video, with no separate TTS or sound design step. Sora 2 is silent and requires a second pass through an audio model.
Veo 3 has stronger cinematic physics and camera coherence than Kling v2 Pro. Kling v2 Pro is competitive on human motion and costs roughly half as much per clip.
License & commercial use
Licensed through Google's commercial terms.
Commercial use is permitted on all paid Oakgen plans. Google's usage policy applies.