Nearest Neighbors Using Compact Sparse Codes
Authors: Anoop Cherian
ICML 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments are conducted on two standard computer vision benchmark datasets of 1M data points each and show an order-of-magnitude improvement in retrieval accuracy, without sacrificing memory usage or query time, compared to state-of-the-art methods. |
| Researcher Affiliation | Academia | Anoop Cherian ANOOP.CHERIAN@INRIA.FR INRIA, LEAR Project-team, Grenoble, France |
| Pseudocode | Yes | Algorithm 1: SpANN Indexing and Retrieval; Algorithm 2: IDL Algorithm (a hedged sketch of the indexing and retrieval idea appears after this table). |
| Open Source Code | No | The paper mentions using a third-party toolbox ("SPAMS toolbox (Mairal et al., 2010)") but does not state that the code for their own proposed methodology is open-source or provide a link. |
| Open Datasets | Yes | Our experiments are mainly based on the evaluation protocol of (Jegou et al., 2011) using two publicly available ANN datasets: (i) 1M SIFT and (ii) 1M GIST descriptors. |
| Dataset Splits | Yes | The first dataset is split into a training set with 100K 128-dimensional SIFT descriptors, a base set of 1M descriptors to be queried, and 10K query descriptors. Of the 100K training set, we use a random sample of 90K descriptors for learning the dictionary and 10K for validation. The GIST dataset consists of 960-dimensional descriptors and a training, database, and query set split of 500K, 1M, and 1K respectively. Of the training set, we use 400K descriptors for DL and 100K for validation. |
| Hardware Specification | Yes | Our timing comparisons are based on a single core 2.7 GHz AMD processor with 32GB memory. |
| Software Dependencies | No | The paper mentions using the "SPAMS toolbox (Mairal et al., 2010)" and MATLAB but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | For all the experiments, we used a fixed Jaccard threshold of η = 0.33. We found µ = 0.2 gave the best performance for dictionaries of sizes 256 and 512, while µ = 0.3 performed best for 1024 atoms. We use M = 32 for our SIFT experiments and M = 64 for GIST. (A hedged sketch of tuning these settings follows the table.) |
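
The pseudocode row names Algorithm 1 (SpANN Indexing and Retrieval). The paper's algorithm is not reproduced here, so what follows is a minimal Python sketch of the idea the table describes: sparse-code each point over a dictionary, key an inverted index on the support (the set of active atoms), and rank candidates by the Jaccard similarity of supports using the η = 0.33 threshold from the experiment-setup row. The dictionary below is random rather than learned with the paper's IDL algorithm, and the helper names `support`, `build_index`, and `query` are illustrative, not from the paper.

```python
# Minimal sketch of sparse-code-based NN indexing/retrieval, assuming the idea
# summarized in the table: the support of a point's sparse code is its index
# key, and candidates are ranked by Jaccard similarity of supports. The random
# dictionary and the helpers `support`, `build_index`, `query` are illustrative.
from collections import defaultdict

import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.default_rng(0)
n_atoms, dim = 256, 128                          # dictionary size / SIFT dimension
D = rng.standard_normal((n_atoms, dim))
D /= np.linalg.norm(D, axis=1, keepdims=True)    # unit-norm atoms

coder = SparseCoder(dictionary=D, transform_algorithm="lasso_lars",
                    transform_alpha=0.2)         # alpha stands in for the paper's mu

def support(x):
    """Set of active dictionary atoms in the sparse code of x."""
    return frozenset(np.flatnonzero(coder.transform(x.reshape(1, -1))[0]))

def build_index(database):
    """Inverted index mapping atom id -> ids of points whose support contains it."""
    codes = coder.transform(database)
    supports = [frozenset(np.flatnonzero(c)) for c in codes]
    inverted = defaultdict(list)
    for i, s in enumerate(supports):
        for atom in s:
            inverted[atom].append(i)
    return supports, inverted

def jaccard(a, b):
    return len(a & b) / len(a | b)

def query(q, supports, inverted, eta=0.33):      # eta from the experiment-setup row
    """Ids of points whose support has Jaccard similarity >= eta with q's support."""
    sq = support(q)
    candidates = {i for atom in sq for i in inverted[atom]}
    hits = [i for i in candidates if jaccard(sq, supports[i]) >= eta]
    return sorted(hits, key=lambda i: -jaccard(sq, supports[i]))

database = rng.standard_normal((1000, dim))
supports, inverted = build_index(database)
print(query(database[42], supports, inverted)[:5])   # point 42 should rank first
```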
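
The experiment-setup row fixes µ per dictionary size (0.2 for 256 and 512 atoms, 0.3 for 1024), and the dataset-splits row reserves part of the training set for validation. Below is a hedged sketch of that model-selection loop, substituting scikit-learn's MiniBatchDictionaryLearning for the SPAMS toolbox the paper uses; the synthetic data, the scaled-down split, and scoring by sparse reconstruction error on the held-out set are all assumptions, since the paper validates retrieval quality on real SIFT descriptors.

```python
# Hedged sketch of the dictionary-size / mu model selection described in the
# table, using scikit-learn's MiniBatchDictionaryLearning in place of the SPAMS
# toolbox the paper uses. The synthetic data, the scaled-down split, and the
# reconstruction-error validation criterion are assumptions for illustration.
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning, SparseCoder

rng = np.random.default_rng(0)
train = rng.standard_normal((10_000, 128))       # stand-in for the 100K SIFT training set
fit_set, val_set = train[:9_000], train[9_000:]  # mirrors the 90K DL / 10K validation split

best = None
for n_atoms, mu in [(256, 0.2), (512, 0.2), (1024, 0.3)]:   # settings from the table
    dl = MiniBatchDictionaryLearning(n_components=n_atoms, alpha=mu,
                                     batch_size=256, max_iter=5, random_state=0)
    D = dl.fit(fit_set).components_
    codes = SparseCoder(dictionary=D, transform_algorithm="lasso_lars",
                        transform_alpha=mu).transform(val_set)
    err = float(np.mean(np.sum((val_set - codes @ D) ** 2, axis=1)))
    if best is None or err < best[0]:
        best = (err, n_atoms, mu)

print("best (validation error, atoms, mu):", best)
```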