MEND: Meta Demonstration Distillation for Efficient and Effective In-Context Learning
Authors: Yichuan Li, Xiyao Ma, Sixing Lu, Kyumin Lee, Xiaohu Liu, Chenlei Guo
ICLR 2024 | Conference PDF | Archive PDF
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Comprehensive evaluations across seven diverse ICL task partitions using decoder-only (GPT-2) and encoder-decoder (T5) attest to MEND's prowess. |
| Researcher Affiliation | Collaboration | ¹Worcester Polytechnic Institute, ²Amazon Alexa AI. {yli29,kmlee}@wpi.edu, {maxiya,cynthilu,derecliu,guochenl}@amazon.com |
| Pseudocode | No | The paper does not contain any sections or figures explicitly labeled 'Pseudocode' or 'Algorithm', nor does it present structured steps formatted like code. |
| Open Source Code | Yes | The code is available at https://github.com/bigheiniu/MEND. |
| Open Datasets | Yes | To validate our methodology, we employ the MetaICL dataset introduced by Min et al. (2022a), designed for in-context learning scenarios. MetaICL builds upon existing few-shot datasets, such as CrossFit (Ye et al., 2021) and UnifiedQA (Khashabi et al., 2020). |
| Dataset Splits | Yes | Notably, the MetaICL dataset is divided into two distinct partitions: meta-train and meta-test, with no overlap between them. This setting expects the model to first be trained on meta-train and then evaluated on the meta-test dataset. |
| Hardware Specification | Yes | All experiments were conducted on eight A10 NVIDIA GPUs, each equipped with 24GB of memory. |
| Software Dependencies | Yes | We implemented our proposed methodology using PyTorch v1.13.1 (Paszke et al., 2019), complemented by the Hugging Face Transformers library v4.24.0 (Wolf et al., 2019) and Accelerate v0.20.0 (Gugger et al., 2022). (An environment check reflecting these pins is sketched below the table.) |
| Experiment Setup | Yes | The complete set of stable hyperparameters for training runs can be found in Tab. 5. These parameters are adapted from MetaICL (Min et al., 2022a). Additional hyperparameters that needed exploration and their corresponding search spaces are also detailed in Tab. 5. |
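The pinned dependency versions and reported GPU configuration quoted above lend themselves to a quick pre-flight check before attempting a reproduction run. The snippet below is a minimal sketch of such a check, not part of the authors' released code; only the three version strings and the eight A10 GPUs with 24GB figure come from the paper, and everything else (script structure, function names) is illustrative.

```python
# Illustrative environment sanity check for reproducing the reported setup:
# PyTorch 1.13.1, Transformers 4.24.0, Accelerate 0.20.0, 8x NVIDIA A10 (24GB each).
# This sketch is not part of the MEND release.
import torch
import transformers
import accelerate

EXPECTED_VERSIONS = {"torch": "1.13.1", "transformers": "4.24.0", "accelerate": "0.20.0"}

def check_versions():
    installed = {
        "torch": torch.__version__.split("+")[0],  # drop CUDA suffix such as "+cu117"
        "transformers": transformers.__version__,
        "accelerate": accelerate.__version__,
    }
    for name, expected in EXPECTED_VERSIONS.items():
        status = "OK" if installed[name] == expected else f"MISMATCH (found {installed[name]})"
        print(f"{name} == {expected}: {status}")

def check_gpus(expected_count=8, expected_mem_gb=24):
    n = torch.cuda.device_count()
    print(f"visible GPUs: {n} (paper reports {expected_count})")
    for i in range(n):
        props = torch.cuda.get_device_properties(i)
        mem_gb = props.total_memory / 1024 ** 3
        print(f"  cuda:{i} {props.name}, {mem_gb:.1f} GB (paper reports ~{expected_mem_gb} GB A10)")

if __name__ == "__main__":
    check_versions()
    check_gpus()
```

Running such a check before launching the training script makes version drift or a smaller GPU pool visible immediately, which helps separate genuine reproduction gaps from environment mismatches.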