AutoCast++: Enhancing World Event Prediction with Zero-shot Ranking-based Context Retrieval
Authors: Qi Yan, Raihan Seraj, Jiawei He, Lili Meng, Tristan Sylvain
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results underscore marked improvements across multiple metrics, improving the performance for multiple-choice questions (MCQ) by 48% and true/false (TF) questions by up to 8%. |
| Researcher Affiliation | Collaboration | Qi Yan (University of British Columbia), Raihan Seraj (Borealis AI), Jiawei He (Borealis AI), Lili Meng (Independent Researcher), Tristan Sylvain (Borealis AI) |
| Pseudocode | No | The paper describes the architecture and components of the model but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code is available at https://github.com/BorealisAI/Autocast-plus-plus. |
| Open Datasets | Yes | We assess our model on the Autocast dataset (Zou et al., 2022) |
| Dataset Splits | No | Table 1 provides 'Question Type Train Test Total' for the Autocast dataset, and the text mentions 'The dataset is partitioned with a cut-off in mid-2021 and questions in the test set span from mid-2021 to mid-2022.' While hyper-parameter optimization is mentioned, the specific details of a validation split are not provided. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using pre-trained models like GPT-3 and T5, and techniques like LoRA, but it does not specify version numbers for any software dependencies or libraries (e.g., Python, PyTorch, CUDA versions) required for reproducibility. |
| Experiment Setup | Yes | We initially retrieve K = 50 news articles using BM25 and proceed with our re-ranking process to select N = 10 unless otherwise specified. The reweighting coefficient λ in Eq. (6) is fixed at 0.1. |
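
The retrieve-then-re-rank setup in the last row (K = 50 BM25 candidates narrowed to N = 10) can be illustrated with a minimal sketch. This is not the authors' implementation: `bm25_scores`, `retrieve_then_rerank`, and `overlap_score` are hypothetical names, the BM25 parameters `k1` and `b` are common defaults rather than values from the paper, and the token-overlap re-ranker is only a stand-in for the paper's zero-shot ranking model. The reweighting coefficient λ is not modeled, because Eq. (6) is not reproduced in this summary.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the paper's tokenizer is not given here)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with Okapi BM25.

    k1 and b are common defaults, not values taken from the paper.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    # Average document length; fall back to 1.0 if every document is empty.
    avg_len = sum(len(t) for t in tokenized) / n_docs or 1.0
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = set(tokenize(query))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Non-negative IDF variant (the "+1 inside the log" form).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def retrieve_then_rerank(query, docs, rerank_fn, k=50, n=10):
    """Keep the top-k documents by BM25, then keep the top-n by rerank_fn."""
    scores = bm25_scores(query, docs)
    top_k = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)[:k]
    reranked = sorted(top_k, key=lambda i: rerank_fn(query, docs[i]), reverse=True)
    return [docs[i] for i in reranked[:n]]


def overlap_score(query, doc):
    """Placeholder re-ranker: fraction of query tokens found in the document.

    A stand-in only; the paper's re-ranking is model-based.
    """
    q = set(tokenize(query))
    d = set(tokenize(doc))
    return len(q & d) / len(q) if q else 0.0


if __name__ == "__main__":
    news = [
        "election results announced in france",
        "football match ends in draw",
        "france election polls tighten before vote",
        "stock markets rally",
    ]
    for article in retrieve_then_rerank("france election", news, overlap_score, k=3, n=2):
        print(article)
```

Swapping `overlap_score` for a function that queries a language model for a relevance score would bring the sketch closer to the zero-shot ranking the title describes, while leaving the two-stage structure unchanged.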