Is the Discrete VAE’s Power Stuck in its Prior?

We investigate why probabilistic neural models with discrete latent variables are effective at generating high-quality images. We hypothesize that fitting a more flexible variational posterior distribution and performing joint training of the encoder, decoder, and prior distribution should improve model fit. However, we find that modifying the training procedure for the well-known vector quantized variational autoencoder (VQ-VAE) leads to models with lower marginal likelihood for held-out data and degraded sample quality. These results indicate that current discrete VAEs use their encoder and decoder as a deterministic compression bottleneck. The distribution-matching power of these models lies solely in the prior distribution, which is typically trained after clamping the encoder and decoder.
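
As context for the training procedure the abstract refers to, below is a minimal sketch of the standard two-stage VQ-VAE pipeline: stage 1 trains the encoder, decoder, and codebook as a deterministic compression bottleneck with a reconstruction objective; stage 2 clamps them and fits a prior over the resulting discrete code indices. This is an illustrative reconstruction in PyTorch under stated assumptions, not the paper's implementation: the convolutional architectures, codebook size, and the LSTM prior are placeholders.

```python
# Illustrative two-stage VQ-VAE sketch (assumed PyTorch; architectures are placeholders).
import torch
import torch.nn as nn
import torch.nn.functional as F

# --- Stage 1: encoder/decoder/codebook as a deterministic compression bottleneck ---

class VectorQuantizer(nn.Module):
    """Nearest-neighbour code lookup with straight-through gradients."""
    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.beta = beta

    def forward(self, z_e):                                   # z_e: (B, D, H, W)
        B, D, H, W = z_e.shape
        flat = z_e.permute(0, 2, 3, 1).reshape(-1, D)
        idx = torch.cdist(flat, self.codebook.weight).argmin(dim=1)
        z_q = self.codebook(idx).view(B, H, W, D).permute(0, 3, 1, 2)
        # codebook loss + commitment loss
        vq_loss = F.mse_loss(z_q, z_e.detach()) + self.beta * F.mse_loss(z_e, z_q.detach())
        z_q = z_e + (z_q - z_e).detach()                      # straight-through estimator
        return z_q, idx.view(B, H * W), vq_loss

encoder = nn.Sequential(nn.Conv2d(3, 64, 4, 2, 1), nn.ReLU(),
                        nn.Conv2d(64, 64, 4, 2, 1))
decoder = nn.Sequential(nn.ConvTranspose2d(64, 64, 4, 2, 1), nn.ReLU(),
                        nn.ConvTranspose2d(64, 3, 4, 2, 1))
quantizer = VectorQuantizer()
opt1 = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters())
                        + list(quantizer.parameters()), lr=2e-4)

def stage1_step(x):
    """Reconstruction + VQ objective; the latent is a deterministic code assignment."""
    z_q, _, vq_loss = quantizer(encoder(x))
    loss = F.mse_loss(decoder(z_q), x) + vq_loss
    opt1.zero_grad(); loss.backward(); opt1.step()
    return loss

# --- Stage 2: clamp encoder/decoder/codebook, fit a prior over discrete code indices ---

prior = nn.LSTM(input_size=64, hidden_size=256, batch_first=True)  # placeholder autoregressive prior
head = nn.Linear(256, 512)
opt2 = torch.optim.Adam(list(prior.parameters()) + list(head.parameters()), lr=2e-4)

def stage2_step(x):
    """Likelihood training of the prior only; stage-1 modules receive no gradients."""
    with torch.no_grad():                                     # encoder and codebook are frozen
        _, idx, _ = quantizer(encoder(x))
        emb = quantizer.codebook(idx)                         # (B, H*W, code_dim)
    hidden, _ = prior(emb[:, :-1])                            # predict each code from its prefix
    logits = head(hidden)
    loss = F.cross_entropy(logits.reshape(-1, 512), idx[:, 1:].reshape(-1))
    opt2.zero_grad(); loss.backward(); opt2.step()
    return loss
```

In this division of labor, all distribution matching over images happens in stage 2; stage 1 only learns a compression code, which is the split the abstract argues carries the model's generative power.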
