Bio-Inspired Hashing for Unsupervised Similarity Search

Authors: Chaitanya Ryali, John Hopfield, Leopold Grinberg, Dmitry Krotov

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show that BioHash outperforms previously published benchmarks for various hashing methods. Since our learning algorithm is based on a local and biologically plausible synaptic plasticity rule, our work provides evidence for the proposal that LSH might be a computational reason for the abundance of sparse expansive motifs in a variety of biological systems. We also propose a convolutional variant, BioConvHash, that further improves performance. From the perspective of computer science, BioHash and BioConvHash are fast, scalable, and yield compressed binary representations that are useful for similarity search. ... In this section, we empirically evaluate BioHash, investigate the role of sparsity in the latent space, and compare our results with previously published benchmarks. We consider two settings for evaluation: a) the training set contains unlabeled data, and the labels are used only to evaluate the performance of the hashing algorithm, and b) supervised pretraining on a different dataset is permissible, and features extracted from this pretraining are then used for hashing. In both settings BioHash outperforms previously published benchmarks for various hashing methods.
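The hashing scheme quoted above (sparse binary codes from a top-k winner-take-all over hidden-unit activations, used for Hamming-distance similarity search) can be sketched as follows. This is a minimal illustration, not the authors' code: the trained synaptic weights are replaced by random weights, and the function and variable names are my own.

```python
import numpy as np

def biohash(X, W, k):
    """Hash inputs X to sparse binary codes: for each input, the k hidden
    units receiving the largest input currents are set to 1, the rest to 0."""
    currents = X @ W.T                       # (n_samples, n_hidden) activations
    codes = np.zeros_like(currents, dtype=np.uint8)
    top = np.argsort(-currents, axis=1)[:, :k]   # indices of the top-k currents per row
    np.put_along_axis(codes, top, 1, axis=1)
    return codes

rng = np.random.default_rng(0)
W = rng.standard_normal((512, 64))   # stand-in for learned synapses (hypothetical)
X = rng.standard_normal((10, 64))    # toy inputs; row 0 plays the role of the query
H = biohash(X, W, k=16)              # 512-bit codes with exactly 16 active bits

# Similarity search: rank database items by Hamming distance to the query code.
dist = np.count_nonzero(H[1:] != H[0], axis=1)
nearest = 1 + np.argmin(dist)
```

The key design point mirrored here is that the code length controls capacity while k (the number of active neurons) controls sparsity, which the review notes the authors tune on a validation set.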
Researcher Affiliation | Collaboration | 1Department of CS and Engineering, UC San Diego; 2MIT-IBM Watson AI Lab; 3Princeton Neuroscience Institute, Princeton University; 4IBM Research. Correspondence to: C.K. Ryali <rckrishn@eng.ucsd.edu>, D. Krotov <krotov@ibm.com>.
Pseudocode | No | The paper describes the learning dynamics using mathematical equations and text, but it does not include a dedicated pseudocode block or algorithm listing.
Open Source Code | No | The paper does not contain any statements about releasing code or links to a source code repository.
Open Datasets | Yes | To make our work comparable with recent related work, we used common benchmark datasets: a) MNIST (Lecun et al., 1998), a dataset of 70k grey-scale images (size 28 x 28) of hand-written digits with 10 classes of digits ranging from '0' to '9', b) CIFAR-10 (Krizhevsky, 2009), a dataset containing 60k images (size 32 x 32 x 3) from 10 classes (e.g., car, bird).
Dataset Splits | Yes | For each hash length k, we varied the % of active neurons and evaluated the performance on a validation set (see appendix for details), see Figure 4.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper does not provide specific names or version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | The hyperparameters of the method are p, r, m and Δ. ... For MNIST and CIFAR-10, Δ was set to 0.05 and 0.005 respectively for all experiments.
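The local, biologically plausible plasticity rule that these hyperparameters parameterize can be sketched as below. This follows the rank-based Hebbian/anti-Hebbian rule of Krotov & Hopfield's competing-hidden-units framework, which the paper builds on; the p = 2 simplification, the learning rate, and all variable names are my assumptions, not the authors' implementation.

```python
import numpy as np

def local_plasticity_step(W, x, lr=0.02, r=2, delta=0.4):
    """One update of a rank-based Hebbian/anti-Hebbian rule (p = 2 case).
    The top-ranked unit is pushed toward the input (g = +1); the unit
    ranked r-th is pushed away (g = -delta); all other units are unchanged.
    The update for each unit uses only its own current and weights (local)."""
    I = W @ x                        # input currents to the hidden units
    order = np.argsort(-I)           # units ranked by current, strongest first
    g = np.zeros(W.shape[0])
    g[order[0]] = 1.0                # Hebbian update for the winner
    g[order[r - 1]] = -delta         # anti-Hebbian update for the r-th unit
    # drive each selected row of W toward (or away from) the input x;
    # the -I*W term keeps the weight norms bounded
    W += lr * g[:, None] * (x[None, :] - I[:, None] * W)
    return W

rng = np.random.default_rng(1)
W = rng.standard_normal((50, 20)) * 0.1   # 50 hidden units, 20-dim inputs
for _ in range(100):
    x = rng.standard_normal(20)
    W = local_plasticity_step(W, x)
```

Note that no gradient of a global objective is computed: each synapse's change depends only on its pre- and post-synaptic activity, which is what makes the rule "local" in the sense the review quotes.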