The Learnability of In-Context Learning
Authors: Noam Wies, Yoav Levine, Amnon Shashua
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we propose a first-of-its-kind PAC based framework for in-context learnability, and use it to provide the first finite sample complexity results for the in-context learning setup. Our framework includes an initial pretraining phase, which fits a function to the pretraining distribution, and then a second in-context learning phase, which keeps this function constant and concatenates training examples of the downstream task in its input. Our theoretical analysis reveals that in this setting, in-context learning is more about identifying the task than about learning it, a result which is in line with a series of recent empirical findings. |
| Researcher Affiliation | Academia | Noam Wies, Yoav Levine & Amnon Shashua The Hebrew University of Jerusalem {noam.wies,yoav.levine,shashua}@cs.huji.ac.il |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper is theoretical and does not describe an implemented methodology for which source code would be provided. |
| Open Datasets | No | The paper is theoretical and does not use concrete, publicly available datasets for experiments. It discusses abstract “pretraining distributions” and “downstream task distributions”. |
| Dataset Splits | No | The paper is theoretical and does not report on experiments with dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe experimental procedures that would require specific hardware. No hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe experimental procedures that would require specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with specific hyperparameters or system-level training settings. |
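The Research Type row summarizes the paper's central claim: a pretrained function is kept frozen, downstream examples are concatenated into its input, and in-context learning amounts to identifying the task rather than learning it. The toy sketch below (not from the paper; the tasks, names, and predictor are illustrative assumptions) shows that idea in its simplest form, where a fixed predictor over a known task mixture uses the prompt only to single out the generating task.

```python
# Illustrative sketch (assumed, not the paper's construction): in-context
# learning as task identification. The "pretrained" predictor is fixed over
# a small mixture of candidate tasks; the concatenated examples serve only
# to identify which task generated them before answering the query.

TASKS = {
    "identity": lambda x: x,
    "negation": lambda x: 1 - x,
}

def frozen_predictor(prompt, query):
    """Keep the function constant; use the prompt only to pick the task."""
    # Task identification: keep tasks consistent with every in-context pair.
    consistent = [f for f in TASKS.values()
                  if all(f(x) == y for x, y in prompt)]
    return consistent[0](query) if consistent else None

# Two in-context examples suffice to identify "negation" on bits.
prompt = [(0, 1), (1, 0)]
print(frozen_predictor(prompt, 0))  # -> 1
```

Nothing in the predictor is updated by the prompt; once the task is identified, the answer is read off the frozen mixture, mirroring the "identifying the task rather than learning it" framing.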