Hybrid Autoregressive Inference for Scalable Multi-Hop Explanation Regeneration
Authors: Marco Valentino, Mokanarangan Thayaparan, Deborah Ferreira, André Freitas
AAAI 2022, pp. 11403-11411
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that the hybrid framework significantly outperforms previous sparse models, achieving performance comparable with that of state-of-the-art cross-encoders while being 50 times faster and scalable to corpora of millions of facts. (A hybrid sparse-dense scoring sketch appears after the table.) |
| Researcher Affiliation | Academia | Marco Valentino (1,2), Mokanarangan Thayaparan (1,2), Deborah Ferreira (1), André Freitas (1,2); (1) University of Manchester, United Kingdom; (2) Idiap Research Institute, Switzerland |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | Yes | Implementation and pre-trained models adopted for the experiments are available online: https://github.com/ai-systems/hybrid_autoregressive_inference |
| Open Datasets | Yes | We perform an extensive evaluation on the WorldTree corpus adopting the dataset released for the shared task on multi-hop explanation regeneration (Jansen and Ustalov 2019) |
| Dataset Splits | Yes | We adopt explanations and hypotheses in the training-set (∼1,000) for training the dense encoder and computing the explanatory power for unseen hypotheses at inference time. We perform an extensive evaluation on the WorldTree corpus adopting the dataset released for the shared task on multi-hop explanation regeneration (Jansen and Ustalov 2019)... The WorldTree corpus provides a held-out test-set consisting of 1,240 science questions... The studies are performed on the dev-set since the explanations on the test-set are masked. |
| Hardware Specification | Yes | To this end, we run SCAR on a single 16GB Nvidia Tesla P100 GPU and compare the inference time with that of dense models executed on the same infrastructure (Cartuyvels, Spinks, and Moens 2020). |
| Software Dependencies | No | The paper mentions software such as Sentence-BERT, BM25, and FAISS, and specific models such as bert-base-uncased, but does not provide version numbers for these dependencies (e.g., PyTorch or other library versions). (A FAISS indexing sketch appears after the table.) |
| Experiment Setup | Yes | The best results on explanation regeneration are obtained when running SCAR for 4 inference steps (additional details in Ablation Studies)... We found that the best results are obtained using 5 negative examples for each positive tuple. (A bi-encoder training sketch appears after the table.) |
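
To make the "hybrid" claim in the Research Type row concrete, below is a minimal sketch of combining sparse (BM25) and dense (sentence-embedding) relevance scores to rank facts. This is an illustration, not the authors' SCAR implementation: the `rank_bm25` and `sentence-transformers` packages, the `all-MiniLM-L6-v2` encoder, the toy facts, and the interpolation weight `lam` are all assumptions.

```python
# Minimal hybrid sparse-dense fact ranking sketch (illustrative only).
from rank_bm25 import BM25Okapi
from sentence_transformers import SentenceTransformer, util

facts = [
    "friction is a force that opposes motion",
    "a force is a push or a pull",
    "gravity is a force that pulls objects toward earth",
]
query = "why does a rolling ball slow down"

# Sparse scores: BM25 over whitespace-tokenized facts.
bm25 = BM25Okapi([f.split() for f in facts])
sparse = bm25.get_scores(query.split())

# Dense scores: cosine similarity between sentence embeddings.
encoder = SentenceTransformer("all-MiniLM-L6-v2")
q_emb = encoder.encode(query, convert_to_tensor=True)
f_emb = encoder.encode(facts, convert_to_tensor=True)
dense = util.cos_sim(q_emb, f_emb)[0].tolist()

# Hybrid score: weighted sum (lam is a hypothetical hyper-parameter).
lam = 0.5
ranked = sorted(
    zip(facts, (lam * s + (1 - lam) * d for s, d in zip(sparse, dense))),
    key=lambda x: x[1],
    reverse=True,
)
for fact, score in ranked:
    print(f"{score:.3f}  {fact}")
```

Note that BM25 and cosine scores live on different scales, so a real system would normalize them before interpolation; the paper's actual hybrid combination (autoregressive, conditioned on previously retrieved facts) is more involved than this naive weighted sum.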
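The Software Dependencies row mentions FAISS, which is the kind of library that lets dense retrieval scale to "corpora of millions of facts". A minimal indexing sketch follows, assuming the `faiss-cpu` and `numpy` packages; the embedding dimension and random vectors are placeholders, not the paper's actual embeddings.

```python
# Minimal FAISS dense-retrieval sketch (illustrative only).
import faiss
import numpy as np

dim = 384  # embedding size (model-dependent placeholder)
fact_embeddings = np.random.rand(10_000, dim).astype("float32")
faiss.normalize_L2(fact_embeddings)  # normalize so inner product = cosine

index = faiss.IndexFlatIP(dim)  # exact inner-product search
index.add(fact_embeddings)

query = np.random.rand(1, dim).astype("float32")
faiss.normalize_L2(query)
scores, ids = index.search(query, 5)  # top-5 nearest facts
print(ids[0], scores[0])
```

An exact flat index is fine at this scale; for millions of facts one would typically switch to an approximate index (e.g., `faiss.IndexIVFFlat`) to keep search latency low.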
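The Experiment Setup row reports 5 negative examples per positive tuple for training the dense encoder. Below is a minimal bi-encoder training sketch, assuming the `sentence-transformers` package; the toy (hypothesis, fact) pairs and the `ContrastiveLoss` choice are assumptions, since the table does not spell out the paper's exact loss.

```python
# Minimal bi-encoder training sketch with negative sampling (illustrative only).
from sentence_transformers import SentenceTransformer, InputExample, losses
from torch.utils.data import DataLoader

model = SentenceTransformer("bert-base-uncased")

# One positive (hypothesis, fact) tuple plus sampled negatives, labeled 1/0.
train_examples = [
    InputExample(texts=["a ball slows down", "friction opposes motion"], label=1.0),
    InputExample(texts=["a ball slows down", "plants need sunlight"], label=0.0),
    # ... 4 more sampled negatives per positive, per the paper's setting
]

loader = DataLoader(train_examples, shuffle=True, batch_size=2)
loss = losses.ContrastiveLoss(model)  # one plausible choice of pairwise loss
model.fit(train_objectives=[(loader, loss)], epochs=1)
```

After training, the encoder embeds hypotheses and facts into the same space, so fact relevance reduces to the nearest-neighbor search sketched above.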