A Neural Corpus Indexer for Document Retrieval
Authors: Yujing Wang, Yingyan Hou, Haonan Wang, Ziming Miao, Shibin Wu, Qi Chen, Yuqing Xia, Chengmin Chi, Guoshuai Zhao, Zheng Liu, Xing Xie, Hao Sun, Weiwei Deng, Qi Zhang, Mao Yang
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical studies demonstrated the superiority of NCI on two commonly used academic benchmarks, achieving +21.4% and +16.8% relative enhancement for Recall@1 on NQ320k dataset and R-Precision on TriviaQA dataset, respectively, compared to the best baseline method. |
| Researcher Affiliation | Collaboration | ¹Microsoft, ²Tsinghua University, ³University of Illinois Urbana-Champaign, ⁴Peking University |
| Pseudocode | Yes | The detailed procedure of hierarchical k-means is described in Algorithm 1 in Appendix B.2. (A hedged sketch of the procedure follows the table.) |
| Open Source Code | No | The paper's checklist answers 'Yes' to 'Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?'. However, the main body (Section 4.2, Implementation details) provides neither a URL nor an explicit statement that the authors' own NCI source code is publicly available. The only 'open-source implementation [2]' it mentions is the third-party Anserini toolkit used for the BM25 baseline. |
| Open Datasets | Yes | We conduct our experiments on two popular benchmarks for document retrieval, i.e., the Natural Questions [32] and TriviaQA dataset [29]. |
| Dataset Splits | Yes | We use its predetermined training and validation split for evaluation. |
| Hardware Specification | Yes | All experiments are based on a cluster of NVIDIA V100 GPUs with 32GB memory. Each job takes 8 GPUs, resulting in a total batch size of 128 (16 × 8). |
| Software Dependencies | Yes | The Neural Corpus Indexer is implemented with Python 3.6.10, PyTorch 1.8.1, and Hugging Face Transformers 3.4.0. |
| Experiment Setup | Yes | All NCI experiments are based on a learning rate of 2 × 10⁻⁴ for the encoder and 1 × 10⁻⁴ for the decoder with a batch size of 16 per GPU. We set the scaling factor of the consistency-based regularization loss as α = 0.15 and the dropout ratio as 0.1. (A hedged optimizer sketch follows the table.) |
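
The "Pseudocode" row points to the hierarchical k-means procedure (Algorithm 1, Appendix B.2) that NCI uses to turn document embeddings into semantically structured identifiers: documents are clustered recursively, and each document's identifier is the sequence of cluster indices along its path, so semantically close documents share identifier prefixes. The following is a minimal sketch of that idea, not the authors' released code; the branching factor `k`, the leaf-size threshold `c`, and the scikit-learn `KMeans` backend are illustrative assumptions.

```python
# Hedged sketch of hierarchical k-means document-ID assignment in the
# spirit of NCI's Algorithm 1; k, c, and the sklearn backend are
# illustrative assumptions, not the authors' code.
import numpy as np
from sklearn.cluster import KMeans

def hierarchical_ids(embeddings, doc_indices=None, k=30, c=30, seed=0):
    """Return {doc_index: tuple of cluster indices} built recursively."""
    if doc_indices is None:
        doc_indices = np.arange(len(embeddings))
    # Base case: the cluster is small enough; enumerate leaf positions.
    if len(doc_indices) <= c:
        return {doc: (i,) for i, doc in enumerate(doc_indices)}
    # Split the current document set into k clusters on the embeddings.
    labels = KMeans(n_clusters=k, random_state=seed).fit_predict(
        embeddings[doc_indices])
    ids = {}
    for cluster in range(k):
        members = doc_indices[labels == cluster]
        if len(members) == 0:
            continue
        # Recurse and prefix each child ID with the current cluster index.
        for doc, suffix in hierarchical_ids(
                embeddings, members, k=k, c=c, seed=seed).items():
            ids[doc] = (cluster,) + suffix
    return ids

# Example: 1,000 random 768-d embeddings -> prefix-structured IDs.
ids = hierarchical_ids(np.random.randn(1000, 768).astype(np.float32))
print(ids[0])  # e.g. (17, 4, 22); nearby documents share prefixes
```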
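
The "Experiment Setup" row can likewise be made concrete. Below is a minimal sketch assuming a Hugging Face T5-style backbone and the Adam optimizer; the paper reports the learning rates, per-GPU batch size, α, and dropout, but not this exact wiring, so `model.encoder`, `model.decoder`, and `model.lm_head` are the usual T5 attribute names rather than anything taken from the authors' code.

```python
# Hedged sketch of the reported training configuration: encoder lr 2e-4,
# decoder lr 1e-4, dropout 0.1, batch size 16 per GPU (x 8 GPUs = 128).
# The T5 backbone and Adam optimizer are assumptions, not the paper's code.
import torch
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("t5-base", dropout_rate=0.1)

# T5 ties the token embedding across encoder, decoder, and lm_head, so
# deduplicate parameters before building the two learning-rate groups.
seen = set()
def fresh(params):
    unique = [p for p in params if id(p) not in seen]
    seen.update(id(p) for p in unique)
    return unique

optimizer = torch.optim.Adam([
    {"params": fresh(model.encoder.parameters()), "lr": 2e-4},
    {"params": fresh(model.decoder.parameters()), "lr": 1e-4},
    {"params": fresh(model.lm_head.parameters()), "lr": 1e-4},
])

BATCH_SIZE_PER_GPU = 16  # 8 GPUs -> effective batch size 128
ALPHA = 0.15             # weight of the consistency-based regularization loss
DROPOUT = 0.1
```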