Variational Autoencoders

Introduction

In machine learning, there are broadly two classes of models: discriminative and generative. Discriminative models take data (observations) and produce predictions. Generative models instead learn the distribution that produced the data and use it to generate new samples. More precisely, given observations $x$ and predictions $y$, a discriminative model seeks to learn $p(y|x)$, while a generative model seeks to learn $p(x)$.

In this post I’ll explore one type of generative model, the variational autoencoder (VAE), first introduced by Kingma and Welling. VAEs are particularly suited to problems where the observed data is believed to be generated by an underlying set of unobserved latent variables: structure that is real but not directly measurable. The VAE addresses this by jointly learning two things: an encoder that maps observations to a distribution over latent variables, and a decoder that maps latent variables back to the data space. Together these give us both a generative model and a structured latent representation of the data. ...
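To make the encoder/decoder pairing concrete, here is a minimal NumPy sketch of one forward pass. All the specifics here are illustrative assumptions rather than anything from the post: the layer sizes, the single-hidden-layer networks, and the randomly initialized weights standing in for trained parameters. The encoder maps an observation $x$ to the mean and log-variance of a Gaussian over the latent $z$; a sample of $z$ is drawn from that distribution; and the decoder maps $z$ back to the data space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions (assumptions, not from the post).
x_dim, h_dim, z_dim = 8, 16, 2

# Randomly initialized weights stand in for trained parameters.
W_enc    = rng.normal(0, 0.1, (x_dim, h_dim))
W_mu     = rng.normal(0, 0.1, (h_dim, z_dim))
W_logvar = rng.normal(0, 0.1, (h_dim, z_dim))
W_dec    = rng.normal(0, 0.1, (z_dim, h_dim))
W_out    = rng.normal(0, 0.1, (h_dim, x_dim))

def encode(x):
    """Map observations x to the parameters (mean, log-variance) of a
    Gaussian distribution over the latent variables."""
    h = np.tanh(x @ W_enc)
    return h @ W_mu, h @ W_logvar

def sample_z(mu, logvar):
    """Draw z = mu + sigma * eps, with eps ~ N(0, I)."""
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z):
    """Map latent variables z back to the data space."""
    h = np.tanh(z @ W_dec)
    return h @ W_out

x = rng.normal(size=(4, x_dim))  # a small batch of observations
mu, logvar = encode(x)
z = sample_z(mu, logvar)
x_hat = decode(z)
print(mu.shape, z.shape, x_hat.shape)  # (4, 2) (4, 2) (4, 8)
```

Writing the sample as `mu + sigma * eps` rather than sampling from the Gaussian directly previews the reparameterization trick, which is what makes the encoder trainable by gradient descent in a full VAE.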

April 6, 2026