Nearest Neighbors Using Compact Sparse Codes

Authors: Anoop Cherian

ICML 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments are conducted on two state-of-the-art computer vision datasets with 1M data points and show an order of magnitude improvement in retrieval accuracy without sacrificing memory and query time compared to the state-of-the-art methods.
Researcher Affiliation | Academia | Anoop Cherian ANOOP.CHERIAN@INRIA.FR INRIA, LEAR Project-team, Grenoble, France
Pseudocode | Yes | Algorithm 1 SpANN Indexing and Retrieval; Algorithm 2 IDL Algorithm
Open Source Code | No | The paper mentions using a third-party toolbox ("SPAMS toolbox (Mairal et al., 2010)") but does not state that the code for their own proposed methodology is open-source or provide a link.
Open Datasets | Yes | Our experiments are mainly based on the evaluation protocol of (Jegou et al., 2011) using two publicly available ANN datasets: (i) 1M SIFT and (ii) 1M GIST descriptors.
Dataset Splits | Yes | The first dataset is split into a training set with 100K 128-dimensional SIFT descriptors, a base set of 1M descriptors to be queried, and 10K query descriptors. Of the 100K training set, we use a random sample of 90K descriptors for learning the dictionary and 10K for validation. The GIST dataset consists of 960-dimensional descriptors and a training, database, and query step split of 500K, 1M, and 1K respectively. Of the training set, we use 400K descriptors for DL and 100K for validation.
Hardware Specification | Yes | Our timing comparisons are based on a single core 2.7 GHz AMD processor with 32GB memory.
Software Dependencies | No | The paper mentions using "SPAMS toolbox (Mairal et al., 2010)" and "MATLAB" but does not specify their version numbers or other software dependencies with version details.
Experiment Setup | Yes | For all the experiments, we used a fixed Jaccard threshold of η = 0.33. We found µ = 0.2 gave the best performance for dictionaries of sizes 256 and 512, while µ = 0.3 performed best for 1024 atoms. We use M = 32 for our SIFT experiments and M = 64 for GIST.
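
To make two of the reported settings concrete, here is a minimal Python sketch of a Jaccard threshold test with η = 0.33 and of the 90K/10K random split of the SIFT training set. It is an illustration under stated assumptions, not the paper's code, which is not available. The function names (jaccard, passes_threshold, split_indices) are hypothetical. It is also an assumption that the threshold is applied to sets of active dictionary-atom indices and that "at least η" is the acceptance rule; the rows above state only the value of η.

```python
import random

ETA = 0.33  # Jaccard threshold reported in the Experiment Setup row


def jaccard(a, b):
    """Jaccard similarity |a & b| / |a | b| of two index sets."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(union)


def passes_threshold(a, b, eta=ETA):
    """Assumed acceptance rule: keep a candidate if similarity is at least eta."""
    return jaccard(a, b) >= eta


def split_indices(n_total, n_first, seed=0):
    """Randomly partition range(n_total) into two disjoint index lists."""
    rng = random.Random(seed)
    idx = list(range(n_total))
    rng.shuffle(idx)
    return idx[:n_first], idx[n_first:]


# 100K SIFT training descriptors: 90K for dictionary learning, 10K for validation.
learn_idx, val_idx = split_indices(100_000, 90_000)

# Hypothetical active sets: they share 2 of 4 distinct atoms, so similarity is 0.5.
query_atoms = {1, 2, 3}
candidate_atoms = {2, 3, 4}
is_match = passes_threshold(query_atoms, candidate_atoms)
```

The same split_indices call with (500_000, 400_000) would mirror the 400K/100K split reported for GIST. The seed is fixed only so the sketch is repeatable.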