The Learnability of In-Context Learning

Authors: Noam Wies, Yoav Levine, Amnon Shashua

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this paper, we propose a first-of-its-kind PAC-based framework for in-context learnability, and use it to provide the first finite sample complexity results for the in-context learning setup. Our framework includes an initial pretraining phase, which fits a function to the pretraining distribution, and then a second in-context learning phase, which keeps this function constant and concatenates training examples of the downstream task in its input. Our theoretical analysis reveals that in this setting, in-context learning is more about identifying the task than about learning it, a result that is in line with a series of recent empirical findings. (A minimal illustrative sketch of this two-phase setup follows the table.)
Researcher Affiliation | Academia | Noam Wies, Yoav Levine & Amnon Shashua, The Hebrew University of Jerusalem. {noam.wies,yoav.levine,shashua}@cs.huji.ac.il
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper is theoretical and does not describe an implemented methodology for which source code would be provided.
Open Datasets | No | The paper is theoretical and does not use concrete, publicly available datasets for experiments. It discusses abstract “pretraining distributions” and “downstream task distributions”.
Dataset Splits | No | The paper is theoretical and does not report on experiments with dataset splits for training, validation, or testing.
Hardware Specification | No | The paper is theoretical and describes no experimental procedures that would require specific hardware; no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not describe experimental procedures that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings.
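
To make the two-phase setup described in the Research Type row concrete, the following is a minimal Python sketch, not the authors' method: the paper is purely theoretical and ships no code. Every name here (in_context_predict, the prompt format, toy_pretrained_model) is a hypothetical illustration of the abstract description: a function is fit during pretraining, then held constant while downstream training examples are concatenated into its input.

```python
# Minimal sketch of the two-phase in-context learning setup, under the
# assumptions stated in the lead-in. All names and the prompt format are
# illustrative, not from the paper.

from typing import Callable, List, Tuple

Example = Tuple[str, str]  # an (input, label) pair from the downstream task


def in_context_predict(
    f: Callable[[str], str],        # function fit during pretraining, held constant
    demonstrations: List[Example],  # labelled downstream-task examples
    query: str,                     # new input to predict on
) -> str:
    # In-context learning phase: f's parameters are never updated; the
    # training examples are simply concatenated into its input.
    prompt = ""
    for x, y in demonstrations:
        prompt += f"Input: {x}\nLabel: {y}\n"
    prompt += f"Input: {query}\nLabel:"
    return f(prompt)


# Toy frozen "pretrained" function: it identifies the task from the
# demonstrations (here, whether labels are upper- or lower-case) rather
# than fitting new parameters, echoing the paper's task-identification view.
def toy_pretrained_model(prompt: str) -> str:
    lines = prompt.splitlines()
    labels = [line.split("Label: ", 1)[1]
              for line in lines if line.startswith("Label: ")]
    query = lines[-2].split("Input: ", 1)[1]
    return query.upper() if labels and labels[0].isupper() else query.lower()


demos = [("Cat", "CAT"), ("dog", "DOG")]
print(in_context_predict(toy_pretrained_model, demos, "Fox"))  # -> "FOX"
```

In this toy version the demonstrations carry no new information to be learned; they only disambiguate which behaviour the frozen function should exhibit, which is the sense in which the paper finds in-context learning to be more about identifying the task than about learning it.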