Autoregressive Entity Retrieval

Authors: Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show the efficacy of the approach, experimenting with more than 20 datasets on entity disambiguation, end-to-end entity linking and document retrieval tasks, achieving new state-of-the-art or very competitive results while using a tiny fraction of the memory footprint of competing systems. We extensively evaluate GENRE on more than 20 datasets across 3 tasks: Entity Disambiguation, end-to-end Entity Linking (EL), and page-level Document Retrieval.
Researcher Affiliation | Collaboration | Nicola De Cao (1,2), Gautier Izacard (2,3,4), Sebastian Riedel (2,5), Fabio Petroni (2); 1 University of Amsterdam, 2 Facebook AI Research, 3 ENS, PSL University, 4 Inria, 5 University College London; nicola.decao@gmail.com, {gizacard, sriedel, fabiopetroni}@fb.com
Pseudocode | No | The paper describes its method in prose and uses diagrams but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code and pre-trained models at https://github.com/facebookresearch/GENRE. (A constrained-decoding sketch follows the table.)
Open Datasets | Yes | As large generative models benefit from large amounts of data, we first pre-train GENRE on the BLINK data (Wu et al., 2020), i.e., 9M unique document-mention-entity triples from Wikipedia. Then, for the in-domain scenario, we fine-tune using the AIDA-CoNLL dataset (Hoffart et al., 2011).
Dataset Splits | Yes | We pre-trained GENRE on BLINK data for 200k steps and then did model selection on the validation set. Afterward, we fine-tuned on AIDA for 10k steps, without resetting the learning rate or the optimizer statistics, and again did model selection on the validation set.
Hardware Specification | No | Training was done on 32 GPUs (with 32GB of memory) and completed in 24h, for a total of 32 GPU-days.
Software Dependencies | No | We implemented, trained, and evaluated our model using the fairseq library (Ott et al., 2019).
Experiment Setup | Yes | We trained GENRE for every task using Adam (Kingma & Ba, 2014) with a learning rate of 3e-5, a linear warm-up for 500 steps, and then linear decay. The objective is a sequence-to-sequence categorical cross-entropy loss with 0.1 label smoothing.
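
The Experiment Setup and Dataset Splits rows above pin down the optimization schedule: Adam at 3e-5, a 500-step linear warm-up followed by linear decay, sequence-to-sequence cross-entropy with 0.1 label smoothing, and fine-tuning that keeps the optimizer state from pre-training. The sketch below is a minimal PyTorch approximation of that configuration, assuming standard torch.optim equivalents of the fairseq settings rather than the authors' actual training code.

```python
# Minimal PyTorch sketch of the reported optimization setup (assumed
# equivalents of the fairseq configuration; not the authors' code).
import torch
from torch import nn

def build_training_objects(model, total_steps, warmup_steps=500, lr=3e-5):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    def lr_lambda(step):
        # Linear warm-up for the first 500 steps, then linear decay to zero.
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    # Sequence-to-sequence objective: categorical cross-entropy over target
    # tokens with 0.1 label smoothing (requires PyTorch >= 1.10); padding
    # positions are masked out via ignore_index.
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1, ignore_index=-100)
    return optimizer, scheduler, criterion

# Stand-in module; the paper uses a BART-based sequence-to-sequence model.
seq2seq = nn.Linear(8, 8)
optimizer, scheduler, criterion = build_training_objects(seq2seq, total_steps=200_000)
```

Fine-tuning without resetting the learning rate or the optimizer statistics then amounts to reusing these same optimizer and scheduler objects when switching from the BLINK pre-training data to AIDA and simply continuing for another 10k steps.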
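
The Open Source Code row points to the released checkpoints and prefix-tree utilities. As a rough, hypothetical illustration of the constrained-generation idea behind GENRE, and not the repository's actual API, the sketch below uses a vanilla facebook/bart-large checkpoint together with the generic prefix_allowed_tokens_fn hook of Hugging Face generate() to restrict beam search to a toy candidate set; the [START_ENT]/[END_ENT] markers and the candidate names are illustrative assumptions.

```python
# GENRE-style entity disambiguation sketched as constrained generation:
# beam search that can only emit strings from a prefix trie built over the
# candidate entity names. Vanilla BART stands in for the released weights.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large").eval()

# Toy candidate set; GENRE constrains decoding over millions of Wikipedia titles.
candidates = ["Leonardo da Vinci", "Leonardo DiCaprio"]

# Build a prefix trie over the tokenized candidates. Every decoder sequence
# starts with the model's decoder_start_token_id.
trie = {}
for name in candidates:
    ids = [model.config.decoder_start_token_id] + tokenizer(
        " " + name, add_special_tokens=True
    )["input_ids"]
    node = trie
    for tok in ids:
        node = node.setdefault(tok, {})

def allowed_tokens(batch_id, prefix_ids):
    """Return the token ids that keep the generated prefix inside the trie."""
    node = trie
    for tok in prefix_ids.tolist():
        if tok not in node:
            return [tokenizer.eos_token_id]  # dead end: force termination
        node = node[tok]
    return list(node.keys()) or [tokenizer.eos_token_id]

source = "There is a [START_ENT] Leonardo [END_ENT] painting in the Louvre."
inputs = tokenizer([source], return_tensors="pt")
output_ids = model.generate(
    **inputs,
    num_beams=5,
    max_length=20,
    prefix_allowed_tokens_fn=allowed_tokens,
)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```

Without the GENRE fine-tuning, an off-the-shelf BART model will not reliably rank the correct candidate first, but the constraint guarantees that the output is one of the listed entity names, which is the mechanism the released models rely on at a much larger scale.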