Residual Quantization with Implicit Neural Codebooks
Authors: Iris A.M. Huijben, Matthijs Douze, Matthew J. Muckley, Ruud Van Sloun, Jakob Verbeek
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that QINCo outperforms state-of-the-art methods by a large margin on several datasets and code sizes. |
| Researcher Affiliation | Collaboration | FAIR at Meta; Eindhoven University of Technology. Correspondence to: Matthijs Douze <matthijs@meta.com>, Jakob Verbeek <jjverbeek@meta.com>. |
| Pseudocode | No | No pseudocode or algorithm blocks found. |
| Open Source Code | Yes | Code can be found at https://github.com/facebookresearch/QINCo. |
| Open Datasets | Yes | Deep1B (D=96) (Babenko & Lempitsky, 2016) and Big ANN (D=128) (Jégou et al., 2011) are widely-used benchmark datasets for VQ and similarity search that contain CNN image embeddings and SIFT descriptors, respectively. Facebook Sim Search Net++ (FB-ssnpp; D=256) (Simhadri et al., 2022) contains image embeddings intended for image copy detection that were generated using the SSCD model (Pizzi et al., 2022) for a challenge on approximate nearest neighbor search. It is considered challenging for indexing, as the vectors are spread far apart. SIFT1M (D=128) (Jégou et al., 2010) is a smaller-scale dataset of SIFT descriptors used for vector search benchmarks. |
| Dataset Splits | Yes | For all datasets, we use available data splits that include a database, a set of queries and a training set, and we hold out a set of 10k vectors from the original training set for validation, except for the smaller SIFT1M dataset for which we use 5k of the 100k vectors as validation vectors. (A split/normalization sketch is given below the table.) |
| Hardware Specification | Yes | All timings are performed on 32 threads of a 2.2 GHz E5-2698 CPU with appropriate batch sizes. (...) The encoding time for the same QINCo model on a Tesla V100 GPU is 28.4 µs per vector. |
| Software Dependencies | Yes | QINCo and its variants were implemented in PyTorch 2.0.1 and trained using the Adam optimizer with default settings (Kingma & Ba, 2015) across eight GPUs with an effective batch size of 1,024. |
| Experiment Setup | Yes | We train models on 500k or 10M vectors (...) and perform early stopping based on the validation loss. During training, all data is normalized by dividing the vector components by their maximum absolute value in the training set. (...) QINCo and its variants were implemented in PyTorch 2.0.1 and trained using the Adam optimizer with default settings (Kingma & Ba, 2015) across eight GPUs with an effective batch size of 1,024. The base learning rate was reduced by a factor of 10 every time the loss on the validation set did not improve for 10 epochs. We stopped training when the validation loss did not improve for 50 epochs. (...) For most experiments we use M ∈ {8, 16} quantization levels and vocabulary size K = 256, which we denote as 8-byte and 16-byte encoding. (...) We fix the hidden dimension to h = 256. (A training-loop sketch is given below the table.) |
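
The data preparation quoted in the Dataset Splits and Experiment Setup rows (hold out 10k validation vectors, or 5k for SIFT1M, and scale by the maximum absolute value in the training set) can be illustrated with a minimal sketch. The function names, the permutation-based hold-out, and the file names in the usage comment are assumptions for illustration, not the released QINCo code.

```python
# Minimal sketch of the reported data split and normalization (assumptions noted above).
import numpy as np

def make_splits(train_full: np.ndarray, n_val: int = 10_000, seed: int = 0):
    """Hold out n_val vectors from the original training set for validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(train_full))
    return train_full[perm[n_val:]], train_full[perm[:n_val]]

def normalize(train: np.ndarray, *others: np.ndarray):
    """Divide all vector components by the max absolute value in the training set."""
    scale = np.abs(train).max()
    return (train / scale,) + tuple(x / scale for x in others)

# Illustrative usage (file names and shapes are hypothetical):
# train_full = np.load("deep1b_train.npy")            # (N, 96) for Deep1B
# train, val = make_splits(train_full, n_val=10_000)  # n_val=5_000 for SIFT1M
# train, val = normalize(train, val)
```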
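The training schedule described in the Experiment Setup row (Adam with default settings, learning rate divided by 10 after 10 epochs without validation improvement, training stopped after 50 such epochs) can be sketched as below. The model interface, data loaders, and base learning rate value are placeholders assumed for illustration, not the authors' implementation.

```python
# Hypothetical training-loop skeleton matching the reported schedule.
import torch

def train(model, train_loader, val_loader, base_lr=1e-3, max_epochs=10_000):
    # Adam with default settings; base_lr is a placeholder, not a value from the paper.
    opt = torch.optim.Adam(model.parameters(), lr=base_lr)
    # Divide the LR by 10 after 10 epochs without validation improvement.
    sched = torch.optim.lr_scheduler.ReduceLROnPlateau(opt, factor=0.1, patience=10)
    best_val, epochs_since_best = float("inf"), 0

    for epoch in range(max_epochs):
        model.train()
        for batch in train_loader:        # effective batch size 1,024 in the paper
            opt.zero_grad()
            loss = model(batch)           # assumed: model returns the quantization loss
            loss.backward()
            opt.step()

        model.eval()
        with torch.no_grad():
            val_loss = sum(model(b).item() for b in val_loader) / len(val_loader)
        sched.step(val_loss)

        if val_loss < best_val:
            best_val, epochs_since_best = val_loss, 0
        else:
            epochs_since_best += 1
            if epochs_since_best >= 50:   # stop after 50 epochs without improvement
                break
```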