Questions: Variational Autoencoders (VAE)

5 questions to test your understanding

Question 1 Multiple Choice

A standard autoencoder trained on face images fails to generate new faces when you sample random points from latent space — the decoder produces noise. Why, and how does a VAE fix this?

A. Standard autoencoders use tanh activations that prevent meaningful generation; VAEs use ReLU, which produces smoother latent spaces
B. Standard autoencoders impose no structure on latent space, so random points fall in uncharted regions; the VAE's KL term regularizes latent codes into a dense, structured distribution
C. Standard autoencoders encode to a single low-dimensional vector, which is too compressed for generation; VAEs use higher-dimensional latent spaces
D. Standard autoencoders overfit to training data; VAEs add dropout regularization to prevent this
Question 2 Multiple Choice

The reparameterization trick in VAE training rewrites the sampling step as z = μ + σ·ε where ε ~ N(0,1). Why is this substitution necessary?

A. It prevents the KL divergence from becoming infinite when σ approaches zero
B. It allows gradients to flow through the sampling step back to the encoder outputs μ and σ, and through them to the encoder's weights
C. It ensures sampled z values stay within a bounded range to prevent numerical instability
D. It converts the Gaussian distribution to a uniform distribution, which is easier to implement
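
For reference, the substitution itself can be sketched in a few lines of plain Python. This is only an illustration for a single scalar latent value: the function name `sample_latent`, the seed, and the values of `mu` and `sigma` are assumptions made for the example, not part of the question.

```python
import random
import statistics

def sample_latent(mu, sigma, rng):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1)."""
    eps = rng.gauss(0.0, 1.0)    # standard normal noise
    return mu + sigma * eps      # shift and scale the noise

rng = random.Random(0)           # fixed seed, chosen only for repeatability
mu, sigma = 2.0, 0.5             # illustrative values, not from the text
samples = [sample_latent(mu, sigma, rng) for _ in range(20000)]

sample_mean = statistics.fmean(samples)
sample_std = statistics.pstdev(samples)
print(sample_mean, sample_std)
```

With this many draws, the printed mean and standard deviation should land close to `mu` and `sigma`, which is a quick check that the rewritten form samples from the same distribution as drawing z directly from N(μ, σ²).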
Question 3 True / False

Removing the KL divergence term from the VAE loss (training with reconstruction loss only) would cause the latent space to become unstructured, degrading generative capability.

T. True
F. False
Question 4 True / False

VAEs typically produce sharper, more detailed image outputs than GANs because the KL regularization enforces a well-organized latent space.

T. True
F. False
Question 5 Short Answer

Why does the KL divergence term in the ELBO loss force the latent space to become structured and usable for generation, rather than just functioning as an arbitrary regularizer?
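
As a starting point, it may help to see what the KL term computes. The sketch below assumes the common setup of a diagonal Gaussian encoder and a standard normal prior, which the question does not state explicitly; the function name `kl_to_standard_normal` and the example inputs are hypothetical.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions.

    Closed form per dimension: 0.5 * (mu^2 + sigma^2 - 1 - log(sigma^2)).
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))

# Two-dimensional latent codes, written as (means, log-variances).
kl_at_prior = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])   # code equal to the prior
kl_shifted = kl_to_standard_normal([3.0, 0.0], [0.0, 0.0])    # one mean moved away from 0
kl_narrow = kl_to_standard_normal([0.0, 0.0], [-4.0, 0.0])    # one variance shrunk toward 0
print(kl_at_prior, kl_shifted, kl_narrow)
```

The three calls evaluate the term at the prior, with a shifted mean, and with a shrunken variance, so you can compare how each kind of deviation is penalized.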
