Spreading vectors for similarity search
Authors: Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This section presents our experimental results. We focus on the class of similarity search methods that represents the database vectors with a compressed representation... All experiments have two phases. |
| Researcher Affiliation | Collaboration | Facebook AI Research; Inria |
| Pseudocode | No | The paper describes algorithmic steps but does not present them in a structured pseudocode block or explicitly label any section as 'Algorithm'. |
| Open Source Code | Yes | The code is available online: https://github.com/facebookresearch/spreadingvectors |
| Open Datasets | Yes | We use two benchmark datasets: Deep1M and BigAnn1M. Deep1M consists of the first million vectors of the Deep1B dataset (Babenko & Lempitsky, 2016). We also experiment with BigAnn1M (Jégou et al., 2011b), which consists of SIFT descriptors (Lowe, 2004). |
| Dataset Splits | Yes | Both datasets contain 1M vectors that serve as a reference set, 10k query vectors, and a very large training set, of which we use 500k elements for training and 1M vectors as a base to cross-validate the hyperparameters d_out and λ. |
| Hardware Specification | Yes | All timings for BigAnn1M are on a 2.2 GHz machine with 40 threads. |
| Software Dependencies | No | The paper mentions using the 'Faiss (Johnson et al., 2017) implementation of PQ and OPQ' but does not specify its version number or any other software dependencies with version numbers. (A hedged Faiss usage sketch follows the table.) |
| Experiment Setup | Yes | Our model is a 3-layer perceptron with ReLU non-linearity and hidden dimension 1024. The final linear layer projects the dataset to the desired output dimension d_out, along with ℓ2-normalization. We use batch normalization (Ioffe & Szegedy, 2015) and train our model for 300 epochs with Stochastic Gradient Descent, with an initial learning rate of 0.1 and a momentum of 0.9. The learning rate is decayed to 0.05 (resp. 0.01) at the 80-th epoch (resp. 120-th). (A hedged PyTorch sketch of this setup follows the table.) |
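
The Experiment Setup row reads as a concrete training recipe. Below is a minimal PyTorch sketch of it, assuming a 3-layer perceptron with hidden dimension 1024, batch normalization, ReLU, a final projection to d_out with ℓ2-normalization, and the reported SGD schedule. The class and variable names (`Catalyzer`, `d_in`, `d_hidden`), the particular `d_out` value, and the placement of batch normalization relative to ReLU are illustrative assumptions, not taken from the authors' released code.

```python
# Minimal sketch of the described setup (not the authors' implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Catalyzer(nn.Module):
    """3-layer MLP: hidden dim 1024, batch norm, ReLU, final projection to d_out."""
    def __init__(self, d_in, d_out, d_hidden=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(d_in, d_hidden), nn.BatchNorm1d(d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_hidden), nn.BatchNorm1d(d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, d_out),           # final projection to d_out
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)    # l2-normalize the output

# SGD with lr 0.1, momentum 0.9; lr decayed to 0.05 at epoch 80 and 0.01 at epoch 120.
model = Catalyzer(d_in=96, d_out=16)              # d_in=96 fits Deep1M; d_out=16 is illustrative
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
scheduler = torch.optim.lr_scheduler.LambdaLR(
    optimizer,
    lambda epoch: 1.0 if epoch < 80 else (0.5 if epoch < 120 else 0.1),
)

for epoch in range(300):
    # ... one pass over the 500k training vectors with the paper's loss goes here ...
    scheduler.step()
```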
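
For the PQ and OPQ baselines cited in the Software Dependencies row, the sketch below shows how they can be built with the Faiss Python API. The number of sub-quantizers (8), the bits per sub-quantizer (8), and the random stand-in arrays are illustrative choices, not parameters reported by the paper.

```python
# Hedged sketch of PQ / OPQ baselines with Faiss (parameters are illustrative).
import faiss
import numpy as np

d = 128                                            # SIFT descriptors in BigAnn1M are 128-dimensional
xt = np.random.rand(10000, d).astype('float32')    # stand-in for the training set
xb = np.random.rand(100000, d).astype('float32')   # stand-in for the reference set
xq = np.random.rand(100, d).astype('float32')      # stand-in for the queries

# Plain product quantization: 8 sub-quantizers, 8 bits each.
index_pq = faiss.IndexPQ(d, 8, 8)
index_pq.train(xt)
index_pq.add(xb)
D, I = index_pq.search(xq, 100)

# OPQ: learn a rotation of the input space before PQ.
opq = faiss.OPQMatrix(d, 8)
index_opq = faiss.IndexPreTransform(opq, faiss.IndexPQ(d, 8, 8))
index_opq.train(xt)
index_opq.add(xb)
D, I = index_opq.search(xq, 100)
```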