Original data · 2026-05-12
19 AI video models. One prompt. Same hardware.
We ran the same cinematic prompt through every AI video model in VIBE and scored the results on speed, quality, motion, and prompt adherence. Here's what we found.
The benchmark prompt
“Cinematic drone shot of a dragon made of living fire flying over a frozen Nordic fjord at twilight. Slow tracking camera, golden hour, deep blue water below.”
All 19 models ran the same prompt at default settings. Quality, motion, and adherence are subjective scores on a 0-10 scale from 3 human raters. Generation times are averaged over 3 runs.
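As a rough illustration of how raw measurements could be reduced to one table row, here is a minimal Python sketch. The article does not say how the three raters' scores were combined, so the plain mean used here is an assumption, and the function name `aggregate` and the sample numbers are hypothetical.

```python
from statistics import mean


def aggregate(rater_scores, run_times_s):
    """Combine raw measurements for one model into a single result row.

    rater_scores: dict mapping a metric name to a list of 0-10 scores,
                  one score per human rater.
    run_times_s:  list of generation times in seconds, one per run.

    Assumption: rater scores are combined with a plain mean. The article
    only states that 3 raters scored each clip.
    """
    for metric, scores in rater_scores.items():
        if any(not 0 <= s <= 10 for s in scores):
            raise ValueError(f"{metric} score outside 0-10: {scores}")

    # One decimal place for scores, whole seconds for time, as in the table.
    row = {m: round(mean(s), 1) for m, s in rater_scores.items()}
    row["time_s"] = round(mean(run_times_s))
    return row


# Hypothetical raw numbers for illustration only; not data from the benchmark.
example = aggregate(
    {"quality": [9, 8, 9], "motion": [8, 8, 9], "adherence": [7, 8, 8]},
    [30, 34, 32],
)
print(example)
```

With only three raters per clip, a single outlier moves the mean noticeably, so a median would be a reasonable alternative if rater disagreement is large.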
🏆 Best overall quality
Sora 2 Pro
9.6/10
⚡ Fastest
LTX 2
9s
🎬 Best motion
Kling 3.0
9.4/10
🎯 Best prompt adherence
Sora 2 Pro
9.5/10
Full benchmark results
Sorted by overall quality.
| Rank | Model | Time | Quality | Motion | Adherence | Audio |
|---|---|---|---|---|---|---|
| #1 | Sora 2 Pro | 178s | 9.6 | 9.2 | 9.5 | Yes |
| #2 | Sora 2 | 92s | 9.2 | 8.9 | 9.3 | Yes |
| #3 | Google Veo 3.1 | 64s | 9.1 | 8.3 | 9.0 | Yes |
| #4 | Luma Ray Flash 2 | 52s | 8.7 | 8.0 | 8.1 | — |
| #5 | Kling 3.0 | 71s | 8.6 | 9.4 | 8.4 | — |
| #6 | Veo 3.1 Lite | 31s | 8.4 | 8.0 | 8.6 | Yes |
| #7 | Seedance 2.0 | 124s | 8.4 | 9.1 | 7.9 | — |
| #8 | Kling o3 | 78s | 8.3 | 8.9 | 7.6 | — |
| #9 | WAN 2.6 | 62s | 8.1 | 7.7 | 9.2 | — |
| #10 | Hailuo | 22s | 8.0 | 7.8 | 8.2 | — |
| #11 | Vidu Q3 | 47s | 7.9 | 7.8 | 7.8 | — |
| #12 | Pruna 5 | 34s | 7.7 | 7.5 | 7.8 | — |
| #13 | Seedance Pro Fast | 28s | 7.6 | 8.7 | 7.3 | — |
| #14 | Veo 3.1 Fast | 11s | 7.5 | 7.4 | 8.0 | Yes |
| #15 | Grok Imagine | 38s | 7.5 | 7.3 | 7.7 | — |
| #16 | WAN 2.2 | 24s | 7.5 | 7.2 | 8.6 | — |
| #17 | PixVerse 5.6 | 43s | 7.4 | 7.6 | 7.2 | — |
| #18 | LTX 2 | 9s | 6.8 | 7.0 | 7.0 | — |
| #19 | Happy Horse | 31s | 6.5 | 6.8 | 6.6 | — |
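The category winners can be rechecked directly from the table. The sketch below copies the table's numbers into a Python list and picks the leader in each column. It also sorts by a quality-per-second ratio, which is an illustrative derived metric and not one of the benchmark's scored categories. The names `RESULTS`, `leader`, and `quality_per_second` are hypothetical.

```python
# (model, time_s, quality, motion, adherence), copied from the results table.
RESULTS = [
    ("Sora 2 Pro", 178, 9.6, 9.2, 9.5),
    ("Sora 2", 92, 9.2, 8.9, 9.3),
    ("Google Veo 3.1", 64, 9.1, 8.3, 9.0),
    ("Luma Ray Flash 2", 52, 8.7, 8.0, 8.1),
    ("Kling 3.0", 71, 8.6, 9.4, 8.4),
    ("Veo 3.1 Lite", 31, 8.4, 8.0, 8.6),
    ("Seedance 2.0", 124, 8.4, 9.1, 7.9),
    ("Kling o3", 78, 8.3, 8.9, 7.6),
    ("WAN 2.6", 62, 8.1, 7.7, 9.2),
    ("Hailuo", 22, 8.0, 7.8, 8.2),
    ("Vidu Q3", 47, 7.9, 7.8, 7.8),
    ("Pruna 5", 34, 7.7, 7.5, 7.8),
    ("Seedance Pro Fast", 28, 7.6, 8.7, 7.3),
    ("Veo 3.1 Fast", 11, 7.5, 7.4, 8.0),
    ("Grok Imagine", 38, 7.5, 7.3, 7.7),
    ("WAN 2.2", 24, 7.5, 7.2, 8.6),
    ("PixVerse 5.6", 43, 7.4, 7.6, 7.2),
    ("LTX 2", 9, 6.8, 7.0, 7.0),
    ("Happy Horse", 31, 6.5, 6.8, 6.6),
]


def leader(column, lowest=False):
    """Return the model name with the best value in the given column index."""
    pick = min if lowest else max
    return pick(RESULTS, key=lambda row: row[column])[0]


def quality_per_second(row):
    """Illustrative ratio: quality score divided by generation time."""
    return row[2] / row[1]


fastest = leader(1, lowest=True)
best_quality = leader(2)
best_motion = leader(3)
best_adherence = leader(4)

# Most quality per second of generation time first.
by_efficiency = sorted(RESULTS, key=quality_per_second, reverse=True)

print(fastest, best_quality, best_motion, best_adherence, sep=" | ")
print("Most quality per second:", by_efficiency[0][0])
```

On these numbers the ratio rewards raw speed: LTX 2 comes out on top and Sora 2 Pro at the bottom, which is why it works better as a tiebreaker between similar-quality models than as a ranking on its own.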
Per-model notes
Sora 2 Pro
Q 9.6 · M 9.2 · A 9.5 · 178s
Best overall. Cinematic composition. Camera move respected. 4K-ready.
Sora 2
Q 9.2 · M 8.9 · A 9.3 · 92s
Excellent, only a half-step behind Pro in roughly half the generation time.
Google Veo 3.1
Q 9.1 · M 8.3 · A 9.0 · 64s
Most photoreal output of the lineup. Audio added ambient wind cleanly.
Luma Ray Flash 2
Q 8.7 · M 8.0 · A 8.1 · 52s
Best atmosphere in the lineup, with the strongest lighting and mood of all 19.
Kling 3.0
Q 8.6 · M 9.4 · A 8.4 · 71s
Smoothest motion in the lineup. The dragon's flapping wings won here.
Veo 3.1 Lite
Q 8.4 · M 8.0 · A 8.6 · 31s
Strong middle tier. Hard to tell apart from full Veo at social-media sizes.
Seedance 2.0
Q 8.4 · M 9.1 · A 7.9 · 124s
Strong motion. Less optimized for atmospheric shots; better suited to action.
Kling o3
Q 8.3 · M 8.9 · A 7.6 · 78s
More creative interpretation. Took risks with color, which sometimes pays off.
WAN 2.6
Q 8.1 · M 7.7 · A 9.2 · 62s
Third-highest prompt adherence score, behind only the two Sora models. Got the details right where others drifted.
Hailuo
Q 8.0 · M 7.8 · A 8.2 · 22s
Most consistent output across re-runs. Boring but bankable.
Vidu Q3
Q 7.9 · M 7.8 · A 7.8 · 47s
Solid all-rounder. No single strength, but no major weakness.
Pruna 5
Q 7.7 · M 7.5 · A 7.8 · 34s
Efficient. Clean output for the compute used.
Seedance Pro Fast
Q 7.6 · M 8.7 · A 7.3 · 28s
Quick motion-specialist tier.
Veo 3.1 Fast
Q 7.5 · M 7.4 · A 8.0 · 11s
Astonishing speed. Loses some atmospheric detail, but composition holds.
Grok Imagine
Q 7.5 · M 7.3 · A 7.7 · 38s
Sharper, more 'online' look. Better for memes than for atmospheric shots.
WAN 2.2
Q 7.5 · M 7.2 · A 8.6 · 24s
WAN-style prompt adherence in less than half the generation time of WAN 2.6.
PixVerse 5.6
Q 7.4 · M 7.6 · A 7.2 · 43s
Pushed the prompt toward a stylized look. Beautiful, but not photoreal.
LTX 2
Q 6.8 · M 7.0 · A 7.0 · 9s
Fastest in the lineup. Loses fine detail, but composition is still readable.
Happy Horse
Q 6.5 · M 6.8 · A 6.6 · 31s
Not built for cinematic prompts. Shines on character / meme content instead.
Run the same prompt yourself
VIBE includes all 19 models we tested. Try the benchmark prompt — or your own — and see which model wins for your use case.