Definition
Latent diffusion
The deep learning technique behind most modern AI video models: iteratively denoising a compressed (latent) representation of a video.
Latent diffusion is the technique behind most modern AI video and image models, including Sora 2, Veo 3.1, Kling, and Stable Diffusion. Instead of generating pixels directly, the model works in a 'latent space' — a compressed representation of an image or video. It starts from random noise in latent space and iteratively denoises toward a coherent output that matches the prompt. The denoised latent is then decoded back into pixel space. Latent diffusion is computationally much cheaper than pixel-space diffusion, which is what made high-resolution AI image and video generation practical. Most AI video models in VIBE use a variant of latent diffusion.
Related terms
Make AI video inside VIBE
19 AI video models. Free starter generations. iPhone, Android, and web.