Video · Google DeepMind · v3 · Released 2025-05-20

Veo 3

Text-to-video with native synchronized audio — dialogue, ambient sound, and music in one pass.

Veo 3 is Google DeepMind's flagship text-to-video model, released in May 2025. It's the only widely available model that generates video and synchronized audio (dialogue, ambient sound, music cues) in a single pass. On Oakgen, an 8-second 1080p clip costs about 480 credits (~$1.85) and renders in 60–120 seconds.


Capabilities at a glance

8-second clips at up to 1920×1080 (1080p)
Native synchronized audio — dialogue, ambient, music cues
Cinematic physics and camera motion
60–120 second generation on Oakgen
Veo 3 Fast variant available at a 35% lower credit cost

Specs

Starting price: $1.85 / generation
Generation time: 60–120 seconds
Max resolution: 1920×1080
Inputs → outputs: text, image → video, audio

How to use Veo 3

1. Describe camera, subject, and audio
Veo 3 handles audio cues in the prompt. Example: 'Slow dolly-in on a bustling Tokyo street at night, neon reflections, ambient city sounds with distant traffic'.
2. Set aspect ratio
16:9 for YouTube, 9:16 for Reels/Shorts/TikTok, 1:1 for feed. All three render at full 1080p.
3. Generate
A typical 8-second clip renders in 60–120 seconds. Queues run longer during peak hours; Oakgen automatically retries on FAL or WaveSpeed.
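
The three steps above can be sketched as a small payload builder. This is only an illustration: the helper names (`build_prompt`, `build_payload`) and the platform keys are hypothetical, and the field names are taken from the request shown in the API access section rather than from a published schema.

```python
# Hypothetical mapping from target platform to the aspect ratios listed in step 2.
ASPECT_RATIOS = {
    "youtube": "16:9",
    "reels": "9:16",
    "shorts": "9:16",
    "tiktok": "9:16",
    "feed": "1:1",
}


def build_prompt(shot: str, details: str, audio: str) -> str:
    """Join the shot description, visual details, and audio cue into one prompt."""
    parts = [p.strip() for p in (shot, details, audio) if p and p.strip()]
    return ", ".join(parts)


def build_payload(platform: str, shot: str, details: str, audio: str,
                  duration: int = 8) -> dict:
    """Assemble a request body; field names mirror the curl example on this page."""
    key = platform.lower()
    if key not in ASPECT_RATIOS:
        raise ValueError(f"unknown platform: {platform!r}")
    return {
        "model": "veo-3",
        "prompt": build_prompt(shot, details, audio),
        "duration": duration,
        "aspect_ratio": ASPECT_RATIOS[key],
    }


payload = build_payload(
    "TikTok",
    "Slow dolly-in on a bustling Tokyo street at night",
    "neon reflections",
    "ambient city sounds with distant traffic",
)
```

Keeping the audio cue as its own argument makes it harder to forget, which matters because the audio is generated from the same prompt as the video.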

API access

curl -X POST https://api.oakgen.ai/v1/generate/video \
  -H "Authorization: Bearer $OAKGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3",
    "prompt": "Slow dolly-in on a Tokyo street at night, neon reflections",
    "duration": 8,
    "aspect_ratio": "16:9"
  }'
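
The same request can be sketched in Python using only the standard library. The endpoint, headers, and body fields are copied from the curl call above; the function names are hypothetical, and the assumption that the endpoint returns a JSON body is not confirmed by this page.

```python
import json
import os
import urllib.request

API_URL = "https://api.oakgen.ai/v1/generate/video"


def build_request(prompt: str, api_key: str, duration: int = 8,
                  aspect_ratio: str = "16:9") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the curl example."""
    body = {
        "model": "veo-3",
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def generate(prompt: str, timeout: float = 300.0) -> dict:
    """Send the request; assumes (unverified) that the response body is JSON."""
    req = build_request(prompt, os.environ["OAKGEN_API_KEY"])
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Usage (requires OAKGEN_API_KEY in the environment and network access):
# result = generate("Slow dolly-in on a Tokyo street at night, neon reflections")
```

The generous timeout reflects the 60–120 second render time quoted on this page; whether the call blocks until the clip is ready or returns a job handle is not stated here.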

Compared to other models

vs. sora-2 — native audio

Veo 3 generates audio in the same pass as video, with no separate TTS or sound-design step. Sora 2 is silent and requires a second pass through an audio model.

vs. kling-v2 — cinematic quality

Veo 3 has stronger cinematic physics and camera coherence. Kling v2 Pro is competitive on human motion and costs roughly half as much per clip.

License & commercial use

Licensed through Google's commercial terms.

Permitted on all paid Oakgen plans. Google's usage policy applies.

FAQs

How much does Veo 3 cost on Oakgen?

Veo 3 starts at $1.85 per generation on Oakgen. Most generations complete in 60–120 seconds. The $19/month Pro plan includes 5,000 credits, covering roughly 10 generations per month.
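
The arithmetic behind that estimate can be worked through as follows. This is a rough sketch that assumes every clip is an 8-second 1080p generation at 480 credits, and that the 35% Veo 3 Fast discount applies to that same 480-credit baseline; neither assumption is confirmed beyond the figures quoted on this page.

```python
PLAN_PRICE_USD = 19.0     # Pro plan, per month
PLAN_CREDITS = 5000       # credits included in the Pro plan
CLIP_CREDITS = 480        # approximate cost of one 8-second 1080p clip
FAST_DISCOUNT_PCT = 35    # Veo 3 Fast: 35% lower credit cost (assumed on the same baseline)

# Implied dollar value of one credit on the Pro plan.
usd_per_credit = PLAN_PRICE_USD / PLAN_CREDITS

# Implied dollar cost of one standard clip at the Pro plan rate.
clip_usd = round(CLIP_CREDITS * usd_per_credit, 2)

# Whole clips that fit in one month's credits.
clips_per_month = PLAN_CREDITS // CLIP_CREDITS

# Same calculation for the Fast variant, using integer math to avoid float drift.
fast_credits = CLIP_CREDITS * (100 - FAST_DISCOUNT_PCT) // 100
fast_clips_per_month = PLAN_CREDITS // fast_credits

print(clips_per_month, clip_usd, fast_credits, fast_clips_per_month)
# prints: 10 1.82 312 16
```

At the Pro plan rate a clip works out to about $1.82, close to the listed ~$1.85; the small gap presumably comes from rounding or a different per-credit rate.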

Can I use Veo 3 commercially?

Permitted on all paid Oakgen plans. Google's usage policy applies.

What is the maximum output resolution?

Veo 3 supports up to 1920×1080.

Does Oakgen provide API access to Veo 3?

Yes. Oakgen's REST API exposes Veo 3 under the model slug 'veo-3'. See the API access section above for an example request.

Related models

Sora 2

Cinematic text-to-video with industry-leading long-range coherence and physics.

Kling v2

Text-to-video with the most realistic human motion of any AI model.