Video · Google DeepMind · v3 · Released 2025-05-20

Veo 3

Text-to-video with native synchronized audio — dialogue, ambient sound, and music in one pass.

Veo 3 is Google DeepMind's flagship text-to-video model, released in May 2025. It's the only widely available model that generates video and synchronized audio (dialogue, ambient sound, music cues) in a single pass. On Oakgen, an 8-second 1080p clip costs about 480 credits (~$1.85) and renders in 60–120 seconds.

Capabilities at a glance

  • 8-second clips at up to 1920×1080 (1080p)
  • Native synchronized audio — dialogue, ambient, music cues
  • Cinematic physics and camera motion
  • 60–120 second generation on Oakgen
  • Veo 3 Fast variant available for 35% lower credit cost
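
To make the credit figures above concrete, here is a rough sketch of the arithmetic in Python. It uses only the numbers quoted on this page (480 credits, ~$1.85, 35% lower for Veo 3 Fast) and assumes the discount is a flat percentage off the standard credit cost; the function names are illustrative, not part of any Oakgen SDK.

```python
# Figures quoted on this page for a standard 8-second 1080p Veo 3 clip.
STANDARD_CREDITS = 480
STANDARD_USD = 1.85
FAST_DISCOUNT_PCT = 35  # "35% lower credit cost" for Veo 3 Fast


def usd_per_credit() -> float:
    """Implied dollar value of one credit, derived from the quoted figures."""
    return STANDARD_USD / STANDARD_CREDITS


def fast_credits() -> int:
    """Estimated credits for a Veo 3 Fast clip (assumes a flat 35% discount)."""
    return round(STANDARD_CREDITS * (100 - FAST_DISCOUNT_PCT) / 100)


def fast_usd() -> float:
    """Estimated dollar cost of a Veo 3 Fast clip, rounded to cents."""
    return round(fast_credits() * usd_per_credit(), 2)
```

Under these assumptions a Veo 3 Fast clip works out to roughly 312 credits, or about $1.20. Treat this as an estimate; actual pricing may round differently.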

Specs

Starting price: $1.85 / generation
Generation time: 60–120 seconds
Max resolution: 1920×1080
Inputs → outputs: text, image → video, audio

How to use Veo 3

  1. Describe camera, subject, and audio
     Veo 3 handles audio cues in the prompt. Example: 'Slow dolly-in on a bustling Tokyo street at night, neon reflections, ambient city sounds with distant traffic'.
  2. Set aspect ratio
     16:9 for YouTube, 9:16 for Reels/Shorts/TikTok, 1:1 for feed. All three render at full 1080p.
  3. Generate
     A typical 8-second clip renders in 60–120 seconds. Queues can be longer during peak hours; Oakgen automatically retries on FAL or WaveSpeed.
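
The first two steps can be sketched as a small helper that assembles a prompt from camera, subject, and audio cues and picks an aspect ratio per platform. This is a minimal illustration: the platform-to-ratio mapping comes from the step above, while the function names and the comma-joined prompt format are assumptions, not an Oakgen convention.

```python
# Platform -> aspect ratio, as listed in step 2 above.
PLATFORM_ASPECT_RATIO = {
    "youtube": "16:9",
    "reels": "9:16",
    "shorts": "9:16",
    "tiktok": "9:16",
    "feed": "1:1",
}


def build_prompt(camera: str, subject: str, audio: str = "") -> str:
    """Join the non-empty prompt parts into one comma-separated prompt."""
    parts = [p.strip() for p in (camera, subject, audio) if p and p.strip()]
    return ", ".join(parts)


def aspect_ratio_for(platform: str) -> str:
    """Look up the aspect ratio for a platform name (case-insensitive)."""
    key = platform.strip().lower()
    if key not in PLATFORM_ASPECT_RATIO:
        raise ValueError(f"unknown platform: {platform!r}")
    return PLATFORM_ASPECT_RATIO[key]
```

For example, `build_prompt("Slow dolly-in on a bustling Tokyo street at night", "neon reflections", "ambient city sounds with distant traffic")` reproduces the sample prompt from step 1, and `aspect_ratio_for("TikTok")` returns `"9:16"`.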

API access

curl -X POST https://api.oakgen.ai/v1/generate/video \
  -H "Authorization: Bearer $OAKGEN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "veo-3",
    "prompt": "Slow dolly-in on a Tokyo street at night, neon reflections",
    "duration": 8,
    "aspect_ratio": "16:9"
  }'
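
For reference, the same request could be built in Python using only the standard library. This sketch mirrors the curl call above (same URL, headers, and JSON fields); the helper names and the 180-second timeout are assumptions, and since the response format is not documented on this page, the body is returned as raw text.

```python
import json
import urllib.request

API_URL = "https://api.oakgen.ai/v1/generate/video"


def build_veo3_request(prompt: str, api_key: str, duration: int = 8,
                       aspect_ratio: str = "16:9") -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl example."""
    payload = {
        "model": "veo-3",
        "prompt": prompt,
        "duration": duration,
        "aspect_ratio": aspect_ratio,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 180.0) -> str:
    """Send the request and return the raw response body as text.

    The timeout is an assumption chosen to exceed the quoted 60-120 s
    render time; whether the endpoint blocks until the clip is ready is
    not stated on this page.
    """
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Keeping request construction separate from sending makes it easy to inspect or log the payload before spending credits.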

Compared to other models

vs. sora-2: native audio

Veo 3 generates audio in the same pass as video, with no separate TTS or sound design step. Sora 2 is silent and requires a second pass through an audio model.

vs. kling-v2: cinematic quality

Veo 3 has stronger cinematic physics and camera coherence. Kling v2 Pro is competitive on human motion and costs roughly half as much per clip.

License & commercial use

Licensed through Google's commercial terms.

Permitted on all paid Oakgen plans. Google's usage policy applies.

FAQs

How much does Veo 3 cost on Oakgen?
Veo 3 starts at $1.85 per generation on Oakgen. Most generations complete in 60–120 seconds. The $19/month Pro plan includes 5,000 credits, covering roughly 10 generations per month.
Can I use Veo 3 commercially?
Permitted on all paid Oakgen plans. Google's usage policy applies.
What is the maximum output resolution?
Veo 3 supports up to 1920×1080.
Does Oakgen provide API access to Veo 3?
Yes. Oakgen's REST API exposes Veo 3 under the model slug 'veo-3'. See the example request in the API access section.
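
The "roughly 10 generations per month" figure in the pricing answer follows from simple division. A minimal sketch, assuming every generation is a standard 480-credit clip (the 8-second 1080p figure quoted on this page) and using hypothetical helper names:

```python
PRO_PLAN_CREDITS = 5000   # credits included in the $19/month Pro plan
PRO_PLAN_USD = 19
CREDITS_PER_CLIP = 480    # standard 8-second 1080p Veo 3 clip


def clips_per_month(plan_credits: int = PRO_PLAN_CREDITS,
                    credits_per_clip: int = CREDITS_PER_CLIP) -> tuple[int, int]:
    """Return (whole clips covered by the plan, credits left over)."""
    return divmod(plan_credits, credits_per_clip)


def plan_usd_per_clip() -> float:
    """Effective dollar cost per clip when paying through the Pro plan."""
    return round(PRO_PLAN_USD * CREDITS_PER_CLIP / PRO_PLAN_CREDITS, 2)
```

Under these assumptions the plan covers 10 full clips with 200 credits to spare, at an effective cost of about $1.82 per clip, close to the ~$1.85 quoted per generation.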
