A Binary Variational Autoencoder for Hashing

Searching a large dataset to find elements that are similar to a sample object is a fundamental problem in computer science. Hashing algorithms deal with this problem by representing data with similarity-preserving binary codes that can be used as indices into a hash table. Recently, it has been shown that variational autoencoders (VAEs) can be successfully trained to learn such codes in unsupervised and semi-supervised scenarios. In this paper, we show that a variational autoencoder with binary latent variables leads to a more natural and effective hashing algorithm than its continuous counterpart. The model reduces the quantization error introduced by continuous formulations but is still trainable with standard back-propagation. Experiments on text retrieval tasks illustrate the advantages of our model with respect to prior work.
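The two ingredients the abstract names, binary latent codes trained with standard back-propagation and retrieval by code similarity, can be illustrated with a short sketch. This is not the paper's implementation; it assumes a straight-through style surrogate for the Bernoulli sampling step (one common way to make hard binary samples differentiable) and ranks database items by Hamming distance:

```python
import numpy as np

def encode_binary(logits, rng):
    """Sample hard binary codes b ~ Bernoulli(sigmoid(logits)).

    Assumption: a straight-through surrogate, where the forward value is
    the hard 0/1 sample but the gradient would flow through the smooth
    probabilities p (in an autograd framework, (b - p) would be detached).
    """
    p = 1.0 / (1.0 + np.exp(-logits))            # Bernoulli probabilities
    b = (rng.random(p.shape) < p).astype(float)  # hard 0/1 sample
    return p + (b - p)                           # numerically equal to b

def hamming_rank(query_code, db_codes):
    """Rank database items by Hamming distance to the query's binary code."""
    dists = np.sum(db_codes != query_code, axis=1)
    return np.argsort(dists)  # nearest codes first

# Illustrative usage with random "encoder outputs"
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 16))   # 4 documents, 16-bit codes
codes = encode_binary(logits, rng)
order = hamming_rank(codes[0], codes)
```

Because the codes are binary, the Hamming distance can be computed with bitwise operations over packed integers in a real system, which is what makes hashing-based retrieval fast; the continuous-VAE alternative must quantize its latent vectors after training, introducing the quantization error the abstract refers to.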


Results from the Paper


| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Text Retrieval | 20 Newsgroups | B-VAE | Precision@100 | 0.441 | #1 |
| Text Retrieval | 20 Newsgroups | VDSH | Precision@100 | 0.319 | #3 |
| Text Retrieval | Reuters-21578 | VDSH | Precision@100 | 0.556 | #3 |
| Text Retrieval | Reuters-21578 | B-VAE | Precision@100 | 0.698 | #2 |
