Electrical Engineering
      and Computer Sciences

Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

Learning to Hash with Binary Reconstructive Embeddings

Brian Kulis and Trevor Darrell

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2009-101
July 16, 2009

http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-101.pdf

Fast retrieval methods are increasingly critical for many large-scale analysis tasks, and there have been several recent methods that attempt to learn hash functions for fast and accurate nearest neighbor searches. In this paper, we develop an algorithm for learning hash functions based on explicitly minimizing the reconstruction error between the original distances and the Hamming distances of the corresponding binary embeddings. We develop a scalable coordinate-descent algorithm for our proposed hashing objective that is able to efficiently learn hash functions in a variety of settings. Unlike existing methods such as semantic hashing and spectral hashing, our method is easily kernelized and does not require restrictive assumptions about the underlying distribution of the data. We present results over several domains to demonstrate that our method outperforms existing state-of-the-art techniques.


BibTeX citation:

@techreport{Kulis:EECS-2009-101,
    Author = {Kulis, Brian and Darrell, Trevor},
    Title = {Learning to Hash with Binary Reconstructive Embeddings},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2009},
    Month = {Jul},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-101.html},
    Number = {UCB/EECS-2009-101},
    Abstract = {Fast retrieval methods are increasingly critical for many large-scale analysis tasks, and there have been several recent methods that attempt to learn hash functions for fast and accurate nearest neighbor searches.  In this paper, we develop an algorithm for learning hash functions based on explicitly minimizing the reconstruction error between the original distances and the Hamming distances of the corresponding binary embeddings.  We develop a scalable coordinate-descent algorithm for our proposed hashing objective that is able to efficiently learn hash functions in a variety of settings.  Unlike existing methods such as semantic hashing and spectral hashing, our method is easily kernelized and does not require restrictive assumptions about the underlying distribution of the data.  We present results over several domains to demonstrate that our method outperforms existing state-of-the-art techniques.}
}

EndNote citation:

%0 Report
%A Kulis, Brian
%A Darrell, Trevor
%T Learning to Hash with Binary Reconstructive Embeddings
%I EECS Department, University of California, Berkeley
%D 2009
%8 July 16
%@ UCB/EECS-2009-101
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-101.html
%F Kulis:EECS-2009-101