Autoregressive Entity Retrieval

Authors: Nicola De Cao, Gautier Izacard, Sebastian Riedel, Fabio Petroni

ICLR 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show the efficacy of the approach, experimenting with more than 20 datasets on entity disambiguation, end-to-end entity linking and document retrieval tasks, achieving new state-of-the-art or very competitive results while using a tiny fraction of the memory footprint of competing systems. We extensively evaluate GENRE on more than 20 datasets across 3 tasks: Entity Disambiguation, end-to-end Entity Linking (EL), and page-level Document Retrieval.
Researcher Affiliation | Collaboration | Nicola De Cao (1,2), Gautier Izacard (2,3,4), Sebastian Riedel (2,5), Fabio Petroni (2); 1 University of Amsterdam, 2 Facebook AI Research, 3 ENS, PSL University, 4 Inria, 5 University College London; nicola.decao@gmail.com, {gizacard, sriedel, fabiopetroni}@fb.com
Pseudocode | No | The paper describes its method in prose and uses diagrams but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Code and pre-trained models at https://github.com/facebookresearch/GENRE. (A constrained-decoding sketch follows the table.)
Open Datasets | Yes | As large generative models benefit from large amounts of data, we first pre-train GENRE on the BLINK data (Wu et al., 2020), i.e., 9M unique document-mention-entity triples from Wikipedia. Then, for the in-domain scenario, we fine-tune using the AIDA-CoNLL dataset (Hoffart et al., 2011).
Dataset Splits | Yes | We pre-trained GENRE on BLINK data for 200k steps and then did model selection on the validation set. Afterward, we fine-tuned on AIDA for 10k steps, without resetting the learning rate or the optimizer statistics, and again did model selection on the validation set.
Hardware Specification | No | Training was done on 32 GPUs (with 32GB of memory) and completed in 24h, for a total of 32 GPU-days.
Software Dependencies | No | We implemented, trained, and evaluated our model using the fairseq library (Ott et al., 2019).
Experiment Setup | Yes | We trained GENRE for every task using Adam (Kingma & Ba, 2014) with a learning rate of 3e-5, a linear warm-up for 500 steps, and then linear decay. The objective is a sequence-to-sequence categorical cross-entropy loss with 0.1 label smoothing.
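
The Experiment Setup and Dataset Splits rows above pin down the optimization schedule: Adam at 3e-5, a 500-step linear warm-up followed by linear decay, sequence-to-sequence cross-entropy with 0.1 label smoothing, and fine-tuning that keeps the optimizer state from pre-training. The sketch below is a minimal PyTorch approximation of that configuration, assuming standard torch.optim equivalents of the fairseq settings rather than the authors' actual training code.

```python
# Minimal PyTorch sketch of the reported optimization setup (assumed
# equivalents of the fairseq configuration; not the authors' code).
import torch
from torch import nn

def build_training_objects(model, total_steps, warmup_steps=500, lr=3e-5):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)

    def lr_lambda(step):
        # Linear warm-up for the first 500 steps, then linear decay to zero.
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    # Sequence-to-sequence objective: categorical cross-entropy over target
    # tokens with 0.1 label smoothing (requires PyTorch >= 1.10); padding
    # positions are masked out via ignore_index.
    criterion = nn.CrossEntropyLoss(label_smoothing=0.1, ignore_index=-100)
    return optimizer, scheduler, criterion

# Stand-in module; the paper uses a BART-based sequence-to-sequence model.
seq2seq = nn.Linear(8, 8)
optimizer, scheduler, criterion = build_training_objects(seq2seq, total_steps=200_000)
```

Fine-tuning without resetting the learning rate or the optimizer statistics then amounts to reusing these same optimizer and scheduler objects when switching from the BLINK pre-training data to AIDA and simply continuing for another 10k steps.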
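
The Open Source Code row points to the released checkpoints and prefix-tree utilities. As a rough, hypothetical illustration of the constrained-generation idea behind GENRE, and not the repository's actual API, the sketch below uses a vanilla facebook/bart-large checkpoint together with the generic prefix_allowed_tokens_fn hook of Hugging Face generate() to restrict beam search to a toy candidate set; the [START_ENT]/[END_ENT] markers and the candidate names are illustrative assumptions.

```python
# GENRE-style entity disambiguation sketched as constrained generation:
# beam search that can only emit strings from a prefix trie built over the
# candidate entity names. Vanilla BART stands in for the released weights.
from transformers import BartForConditionalGeneration, BartTokenizer

tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-large").eval()

# Toy candidate set; GENRE constrains decoding over millions of Wikipedia titles.
candidates = ["Leonardo da Vinci", "Leonardo DiCaprio"]

# Build a prefix trie over the tokenized candidates. Every decoder sequence
# starts with the model's decoder_start_token_id.
trie = {}
for name in candidates:
    ids = [model.config.decoder_start_token_id] + tokenizer(
        " " + name, add_special_tokens=True
    )["input_ids"]
    node = trie
    for tok in ids:
        node = node.setdefault(tok, {})

def allowed_tokens(batch_id, prefix_ids):
    """Return the token ids that keep the generated prefix inside the trie."""
    node = trie
    for tok in prefix_ids.tolist():
        if tok not in node:
            return [tokenizer.eos_token_id]  # dead end: force termination
        node = node[tok]
    return list(node.keys()) or [tokenizer.eos_token_id]

source = "There is a [START_ENT] Leonardo [END_ENT] painting in the Louvre."
inputs = tokenizer([source], return_tensors="pt")
output_ids = model.generate(
    **inputs,
    num_beams=5,
    max_length=20,
    prefix_allowed_tokens_fn=allowed_tokens,
)
print(tokenizer.batch_decode(output_ids, skip_special_tokens=True))
```

Without the GENRE fine-tuning, an off-the-shelf BART model will not reliably rank the correct candidate first, but the constraint guarantees that the output is one of the listed entity names, which is the mechanism the released models rely on at a much larger scale.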