Variational Autoencoders (VAEs) are probabilistic generative models, introduced by Kingma and Welling in 2013, that learn to encode data into a continuous latent space and reconstruct it through a decoder. Unlike traditional autoencoders, which map each input to a single latent point, VAEs map each input to a probability distribution (typically Gaussian) over the latent space and regularize it toward a simple prior, so new, realistic samples can be generated by drawing latent codes from that prior and decoding them. The key insight is the reparameterization trick, which writes the sampled latent code as a deterministic function of the encoder outputs and independent noise; this makes the sampling step differentiable and allows end-to-end training via backpropagation. VAEs are trained by maximizing the Evidence Lower Bound (ELBO), which balances reconstruction quality against a KL-divergence term that keeps the approximate posterior close to the prior. This tension makes them both powerful and nuanced to train effectively.
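The pipeline described above (encode to a mean and log-variance, sample with the reparameterization trick, decode, and score with the ELBO) can be sketched as a single forward pass. This is a minimal NumPy illustration, not a trainable implementation: the layer sizes, the random weights, the `tanh` hidden layers, and the helper names `init_linear`, `reparameterize`, `kl_to_standard_normal`, and `negative_elbo` are all assumptions made for the example, and the squared-error reconstruction term assumes a unit-variance Gaussian likelihood.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
INPUT_DIM, HIDDEN_DIM, LATENT_DIM = 8, 16, 2

def init_linear(n_in, n_out):
    """Return (weights, bias) for a small randomly initialised linear layer."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)

W_h, b_h = init_linear(INPUT_DIM, HIDDEN_DIM)      # encoder hidden layer
W_mu, b_mu = init_linear(HIDDEN_DIM, LATENT_DIM)   # head for the latent mean
W_lv, b_lv = init_linear(HIDDEN_DIM, LATENT_DIM)   # head for the latent log-variance
W_d1, b_d1 = init_linear(LATENT_DIM, HIDDEN_DIM)   # decoder hidden layer
W_d2, b_d2 = init_linear(HIDDEN_DIM, INPUT_DIM)    # decoder output layer

def encoder(x):
    """Map inputs to the mean and log-variance of a diagonal Gaussian q(z|x)."""
    h = np.tanh(x @ W_h + b_h)
    return h @ W_mu + b_mu, h @ W_lv + b_lv

def reparameterize(z_mean, z_log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I).

    The randomness lives in eps, so z is a deterministic (and differentiable)
    function of z_mean and z_log_var.
    """
    eps = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * eps

def decoder(z):
    """Map latent codes back to input space."""
    h = np.tanh(z @ W_d1 + b_d1)
    return h @ W_d2 + b_d2

def kl_to_standard_normal(z_mean, z_log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over latent dimensions."""
    return -0.5 * np.sum(1.0 + z_log_var - z_mean**2 - np.exp(z_log_var), axis=1)

def negative_elbo(x):
    """Return (negative ELBO, reconstruction term, KL term), averaged over the batch."""
    z_mean, z_log_var = encoder(x)
    z = reparameterize(z_mean, z_log_var)
    x_reconstructed = decoder(z)
    # Squared error: Gaussian log-likelihood with unit variance, up to a constant.
    recon = 0.5 * np.sum((x - x_reconstructed) ** 2, axis=1)
    kl = kl_to_standard_normal(z_mean, z_log_var)
    return np.mean(recon + kl), np.mean(recon), np.mean(kl)

x = rng.normal(size=(4, INPUT_DIM))   # a toy batch of 4 inputs
loss, recon, kl = negative_elbo(x)
print(f"negative ELBO: {loss:.4f} (reconstruction {recon:.4f}, KL {kl:.4f})")
```

In a real model the weights would be learned by minimizing this negative ELBO with an automatic-differentiation framework; the sketch only shows how the pieces fit together and why the KL term is zero exactly when the encoder outputs match the standard normal prior.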
What This Cheat Sheet Covers
This topic spans 12 focused tables and 78 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Core Architecture Components
| Component | Example | Description |
|---|---|---|
| Encoder | `z_mean, z_log_var = encoder(x)  # Maps input to latent params` | • Neural network that maps input $x$ to the parameters of a probability distribution (mean $\mu$ and log-variance $\log \sigma^2$) in the latent space • Typically uses CNN layers for images or fully connected layers for tabular data |
| Decoder | `x_reconstructed = decoder(z)  # Maps latent code to output` | • Neural network that reconstructs the input from latent code $z$ • Mirrors the encoder architecture in reverse, often using transposed convolutions for upsampling in image tasks |
| Latent space | `z ~ N(mu, sigma^2)  # Gaussian distribution` | • Low-dimensional continuous representation where each dimension ideally captures a meaningful factor of variation • Enables smooth interpolation and generation of new samples |
| Prior | `p(z) = N(0, I)  # Standard Gaussian prior` | • Assumed distribution over latent variables before observing data • Typically standard normal $\mathcal{N}(0, I)$ to simplify KL divergence computation and enable random sampling |