IDEAL: Influence-Driven Selective Annotations Empower In-Context Learners in Large Language Models
Authors: Shaokun Zhang, Xiaobo Xia, Zhaoqing Wang, Ling-Hao Chen, Jiale Liu, Qingyun Wu, Tongliang Liu
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments confirm the superiority of the proposed method on various benchmarks, achieving better performance with lower time consumption during subset selection. |
| Researcher Affiliation | Academia | Shaokun Zhang (1), Xiaobo Xia (2), Zhaoqing Wang (2), Ling-Hao Chen (3), Jiale Liu (4), Qingyun Wu (1), Tongliang Liu (2); 1: Pennsylvania State University, 2: The University of Sydney, 3: Tsinghua University, 4: Xidian University |
| Pseudocode | Yes | Algorithm 1: Subset influence quantification. Algorithm 2: Searching the subset with maximum influence. |
| Open Source Code | Yes | The project page is available at https://skzhang1.github.io/IDEAL/. Source codes have been attached for the reproducibility of results. |
| Open Datasets | Yes | Following previous work (Su et al., 2023), we employ 9 datasets for the evaluations, which can be categorized into 4 different tasks, including classification, multi-choice, dialogue, and generation. The details of the datasets are provided in Appendix D.1. For each dataset, the original train/dev/test split from the Transformer library (Wolf et al., 2019) is utilized. |
| Dataset Splits | Yes | For each dataset, the original train/dev/test split from the Transformer library (Wolf et al., 2019) is utilized. |
| Hardware Specification | Yes | We run all our experiments of GPT-J 6B and GPT-Neo 2.7B on a single NVIDIA Tesla V100 (32GB) GPU. |
| Software Dependencies | No | The paper mentions PyTorch (Paszke et al., 2019) and the Hugging Face Transformers library (Wolf et al., 2019) but does not provide the specific version numbers of these software components, which are required for reproducibility. |
| Experiment Setup | Yes | The annotation budget is set to 18 and 100 respectively following the same setting as Vote-k. [...] We construct the directed graph for all unlabeled data by connecting each vertex to its 10 nearest successors (k = 10). [...] When quantifying the influence of the subset, we run Algorithm 1 10 times and use the averaged influence value. |
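The setup above mentions two concrete mechanics: a directed graph connecting each unlabeled example to its k = 10 nearest successors, and averaging the subset-influence estimate of Algorithm 1 over 10 runs. The sketch below illustrates both steps under stated assumptions; it is not the authors' implementation. The cosine-similarity neighbor criterion, the independent-cascade-style diffusion with activation probability `p`, and the function names `build_knn_digraph` / `estimate_influence` are all hypothetical choices for illustration.

```python
import numpy as np


def build_knn_digraph(embeddings, k=10):
    """Connect each vertex to its k nearest successors.

    Assumes similarity is measured by cosine similarity of the
    example embeddings; returns succ[i], the indices of vertex i's
    k nearest neighbors (its successors in the directed graph).
    """
    X = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = X @ X.T
    np.fill_diagonal(sims, -np.inf)  # exclude self-loops
    return np.argsort(-sims, axis=1)[:, :k]


def estimate_influence(succ, subset, n_runs=10, p=0.5, seed=None):
    """Monte-Carlo estimate of a subset's influence on the graph.

    Repeatedly activates the candidate subset and lets activation
    spread along outgoing edges with probability p (an
    independent-cascade-style diffusion, used here as a stand-in
    for Algorithm 1). Mirroring the paper's setup, the influence
    value is averaged over n_runs simulations (the paper uses 10).
    """
    rng = np.random.default_rng(seed)
    n = succ.shape[0]
    totals = []
    for _ in range(n_runs):
        active = np.zeros(n, dtype=bool)
        frontier = list(subset)
        active[frontier] = True
        while frontier:
            nxt = []
            for u in frontier:
                for v in succ[u]:
                    if not active[v] and rng.random() < p:
                        active[v] = True
                        nxt.append(v)
            frontier = nxt
        totals.append(int(active.sum()))
    return float(np.mean(totals))
```

Averaging over several runs matters because each diffusion simulation is stochastic; a single run would give a noisy influence estimate and make the greedy subset search (Algorithm 2) unstable.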