An Information-Theoretic Analysis of In-Context Learning
Authors: Hong Jun Jeon, Jason D. Lee, Qi Lei, Benjamin Van Roy
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We introduce new information-theoretic tools that lead to a concise yet general decomposition of error for a Bayes optimal predictor into two components: meta-learning error and intra-task error. These tools unify analyses across many meta-learning challenges. To illustrate, we apply them to establish new results about in-context learning with transformers and corroborate existing results in a simple linear setting. Our theoretical results characterize how error decays in both the number of training sequences and sequence lengths. |
| Researcher Affiliation | Academia | 1Department of Computer Science, Stanford University, Stanford, CA, USA 2Princeton University, Princeton, NJ, USA 3New York University, New York City, NY, USA 4Stanford University, Stanford, CA, USA. |
| Pseudocode | No | The paper contains mathematical derivations, theorems, and proofs but no explicit pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements about releasing code or links to repositories for the methodology described. |
| Open Datasets | No | The paper is theoretical and analyzes abstract data generating processes. It does not mention or use any specific public datasets or provide access information for a dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments. Therefore, it does not provide any dataset split information for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup, thus no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe any experimental setup. It does not list specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters, training configurations, or system-level settings. |
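
The abstract's central claim, that the Bayes optimal predictor's error splits into a meta-learning component and an intra-task component, can be sketched schematically as follows (the symbols $M$, $T$, and $\varepsilon$ are illustrative placeholders, not the paper's notation):

```latex
% Schematic of the error decomposition described in the abstract
% (illustrative notation only): total excess error splits into a
% meta-learning term, decaying in the number of training sequences M,
% and an intra-task term, decaying in the sequence length T.
\[
  \mathcal{L}_{M,T} \;-\; \mathcal{L}^{*}
  \;=\;
  \underbrace{\varepsilon_{\mathrm{meta}}(M)}_{\to\, 0 \ \text{as}\ M \to \infty}
  \;+\;
  \underbrace{\varepsilon_{\mathrm{intra}}(T)}_{\to\, 0 \ \text{as}\ T \to \infty}
\]
```

This matches the abstract's statement that error decays in both the number of training sequences and the sequence length; the precise rates and constants are given in the paper's theorems.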