Efficient Querying from Weighted Binary Codes
Authors: Zhenyu Weng, Yuesheng Zhu
Pages: 12346-12353
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments on three large-scale datasets validate both the search efficiency and the search accuracy of our method. Especially for the number of weighted binary codes up to one billion, our method shows a great improvement of more than 1000 times faster than the linear scan. |
| Researcher Affiliation | Academia | Zhenyu Weng, Yuesheng Zhu Communication and Information Security Laboratory, Shenzhen Graduate School, Peking University {wzytumbler, zhuys}@pku.edu.cn |
| Pseudocode | Yes | Algorithm 1 Querying with single-index hash table |
| Open Source Code | No | The paper does not provide any explicit statement or link regarding the availability of its source code. |
| Open Datasets | Yes | The Places205 dataset (Zhou et al. 2014) is a scene-centric dataset with 205 scene categories... GIST1M dataset (Jegou, Douze, and Schmid 2011) contains 1 million 960-D GIST descriptors... SIFT1B dataset (Jegou, Douze, and Schmid 2011) contains 1 billion 128-D SIFT descriptors and 10000 queries. |
| Dataset Splits | No | For Places205, the ground truth refers to as the true neighbors the identifiers that have the same label as the query. For GIST1M and SIFT1B, the ground truth refers to as the true neighbors the top 1000 identifiers selected by linear scan with the Euclidean distance from the query in the original space, i.e. Euclidean space. |
| Hardware Specification | Yes | All the experiments are run on a single core Intel Core-i7 CPU with 32GB of memory. |
| Software Dependencies | No | These querying methods are all implemented in C++. |
| Experiment Setup | Yes | For MIH and our method, MIWQ, we use the same heuristic (Norouzi, Punjani, and Fleet 2014) to determine the number of the substrings m, which is b / log_2(n), where b is the length of the binary code and n is the data size. |
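
The substring-count heuristic quoted in the Experiment Setup row, m = b / log_2(n), can be sketched in a few lines. This is an illustrative sketch only, not the authors' C++ implementation: the function name `num_substrings`, the rounding to the nearest integer, and the clamping to the range [1, b] are assumptions made here, since the quoted text does not say how a fractional value is handled. The code lengths in the usage lines are hypothetical as well.

```python
import math


def num_substrings(code_length_bits: int, dataset_size: int) -> int:
    """Estimate the number of substrings m = b / log2(n).

    code_length_bits: b, the length of each binary code in bits.
    dataset_size: n, the number of codes in the database.

    Assumption: the ratio is rounded to the nearest integer and clamped
    to [1, b]; the source only gives the formula b / log_2(n).
    """
    if code_length_bits <= 0:
        raise ValueError("code_length_bits must be positive")
    if dataset_size < 2:
        raise ValueError("dataset_size must be at least 2 so that log2(n) > 0")

    ratio = code_length_bits / math.log2(dataset_size)
    m = round(ratio)
    # Keep at least one substring, and never more substrings than bits.
    return max(1, min(m, code_length_bits))


if __name__ == "__main__":
    # Hypothetical code lengths; the dataset sizes match GIST1M and SIFT1B.
    for b, n in [(64, 10**6), (128, 10**6), (64, 10**9), (128, 10**9)]:
        print(f"b={b}, n={n}: m={num_substrings(b, n)}")
```

Because log_2(n) grows slowly, m shrinks only modestly as the database grows from one million to one billion codes, so each substring stays roughly log_2(n) bits long.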