Is the Discrete VAE’s Power Stuck in its Prior?

We investigate why probabilistic neural models with discrete latent variables are effective at generating high-quality images. We hypothesize that fitting a more flexible variational posterior distribution and performing joint training of the encoder, decoder, and prior distribution should improve model fit. However, we find that modifying the training procedure for the well-known vector quantized variational autoencoder (VQ-VAE) leads to models with lower marginal likelihood for held-out data and degraded sample quality. These results indicate that current discrete VAEs use their encoder and decoder as a deterministic compression bottleneck. The distribution-matching power of these models lies solely in the prior distribution, which is typically trained after clamping the encoder and decoder.
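
As context for the training procedure the abstract refers to, below is a minimal sketch of the standard two-stage VQ-VAE pipeline: stage 1 trains the encoder, decoder, and codebook as a deterministic compression bottleneck with a reconstruction objective; stage 2 clamps them and fits a prior over the resulting discrete code indices. This is an illustrative reconstruction in PyTorch under stated assumptions, not the paper's implementation: the convolutional architectures, codebook size, and the LSTM prior are placeholders.

```python
# Illustrative two-stage VQ-VAE sketch (assumed PyTorch; architectures are placeholders).
import torch
import torch.nn as nn
import torch.nn.functional as F

# --- Stage 1: encoder/decoder/codebook as a deterministic compression bottleneck ---

class VectorQuantizer(nn.Module):
    """Nearest-neighbour code lookup with straight-through gradients."""
    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.beta = beta

    def forward(self, z_e):                                   # z_e: (B, D, H, W)
        B, D, H, W = z_e.shape
        flat = z_e.permute(0, 2, 3, 1).reshape(-1, D)
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        z_q = self.codebook(idx).view(B, H, W, D).permute(0, 3, 1, 2)
        # codebook loss + commitment loss
        vq_loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()                      # straight-through estimator
        return z_q, idx.view(B, H * W), vq_loss

encoder = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
                        nn.Conv2d(64, 64, 4, 2, 1))
decoder = nn.Sequential(nn.ConvTranspose2d(64, 64, 4, 2, 1), nn.ReLU(),
                        nn.ConvTranspose2d(64, 3, 4, 2, 1))
quantizer = VectorQuantizer()
opt1 = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters())
                        + list(quantizer.parameters()), lr=2e-4)

def stage1_step(x):
    """Reconstruction + VQ objective; the latent is a deterministic code assignment."""
    z_q, _, vq_loss = quantizer(encoder(x))
    loss = F.mse_loss(decoder(z_q), x) + vq_loss
    opt1.zero_grad(); loss.backward(); opt1.step()
    return loss

# --- Stage 2: clamp encoder/decoder/codebook, fit a prior over discrete code indices ---

prior = nn.LSTM(input_size=64, hidden_size=256, batch_first=True)  # placeholder autoregressive prior
head = nn.Linear(256, 512)
opt2 = torch.optim.Adam(list(prior.parameters()) + list(head.parameters()), lr=2e-4)

def stage2_step(x):
    """Likelihood training of the prior only; stage-1 modules receive no gradients."""
    with torch.no_grad():                                     # encoder and codebook are frozen
        _, idx, _ = quantizer(encoder(x))
        emb = quantizer.codebook(idx)                         # (B, H*W, code_dim)
    hidden, _ = prior(emb[:, :-1])                            # predict each code from its prefix
    logits = head(hidden)
    loss = F.cross_entropy(logits.reshape(-1, 512), idx[:, 1:].reshape(-1))
    opt2.zero_grad(); loss.backward(); opt2.step()
    return loss
```

In this division of labor, all distribution matching over images happens in stage 2; stage 1 only learns a compression code, which is the split the abstract argues carries the model's generative power.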
