In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Authors: Jannik Kossen, Yarin Gal, Tom Rainforth
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments show that ICL predictions almost always depend on in-context labels and that ICL can learn truly novel tasks in-context. |
| Researcher Affiliation | Academia | 1 OATML, Department of Computer Science, University of Oxford 2 Department of Statistics, University of Oxford |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We provide the code to reproduce our results at the following repository: github.com/jlko/in context learning. |
| Open Datasets | Yes | We evaluate on SST-2 (Socher et al., 2013), Subjective (Wang & Manning, 2012), Financial Phrasebank (Malo et al., 2014), Hate Speech (de Gibert et al., 2018), AG News (Zhang et al., 2015), Medical Questions Pairs (MQP) (McCreery et al., 2020), as well as Microsoft Research Paraphrase Corpus (MRPC) (Dolan & Brockett, 2005), Recognizing Textual Entailment (RTE) (Dagan et al., 2005), and Winograd Schema Challenge (WNLI) (Levesque et al., 2012) from GLUE (Wang et al., 2019). |
| Dataset Splits | No | The paper refers to the 'training set' and 'test set' of existing datasets but does not specify split percentages, sample counts, or citations to predefined splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper mentions using 'Hugging Face Python library (Wolf et al., 2020) and PyTorch (Paszke et al., 2019)' but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | We use the following simple templates to format the in-context examples. For SST-2, Subjectivity, Financial Phrasebank, Hate Speech, and our author identification task, we use the following line of Python code to format each input example: f"Sentence: {sentence} \n Answer: {label}\n\n". |
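The template quoted above can be turned into a full prompt by concatenating formatted in-context examples and appending the query without its label. A minimal sketch, assuming hypothetical helper names (`format_example`, `build_prompt`) not used in the paper; the f-string template and its whitespace are reproduced exactly as quoted:

```python
def format_example(sentence: str, label: str) -> str:
    # Template as quoted in the paper for SST-2, Subjectivity,
    # Financial Phrasebank, Hate Speech, and author identification.
    return f"Sentence: {sentence} \n Answer: {label}\n\n"


def build_prompt(examples: list[tuple[str, str]], query: str) -> str:
    # Concatenate labeled in-context examples, then the query
    # with an empty answer slot for the model to complete.
    prompt = "".join(format_example(s, l) for s, l in examples)
    prompt += f"Sentence: {query} \n Answer:"
    return prompt


demo = [
    ("a gripping, well-acted film", "positive"),
    ("a dull and lifeless mess", "negative"),
]
print(build_prompt(demo, "an uneven but charming debut"))
```

The resulting string would be fed directly to the language model; how labels are tokenized and scored from the completion is not shown here.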