In-Context Learning through the Bayesian Prism

Authors: Madhur Panwar, Kabir Ahuja, Navin Goyal

ICLR 2024

Reproducibility

Variable | Result | LLM Response
Research Type | Experimental | "In this paper we empirically examine how far this Bayesian perspective can help us understand ICL."
Researcher Affiliation | Collaboration | Microsoft Research India ({t-mpanwar, navingo}@microsoft.com) and University of Washington (kahuja@cs.washington.edu)
Pseudocode | No | The paper describes methods and equations but does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | "We release our code at https://github.com/mdrpanwar/icl-bayesian-prism"
Open Datasets | No | "In all of our experiments except the ones concerning the Fourier series, we choose D_X as the standard normal distribution, i.e., N(0, 1), unless specified otherwise." The paper describes how data is generated from distributions (e.g., standard normal) over function families; it does not refer to a pre-existing, downloadable public dataset.
Dataset Splits | No | The paper generates data dynamically from the specified distributions for training and evaluation (e.g., "x_i ∈ R^d and are chosen i.i.d. from a distribution, and f : R^d → R is a function from a family of functions"). It does not specify fixed training, validation, or test splits (a sampling sketch follows this table).
Hardware Specification | Yes | "Our experiments were conducted on a system comprising 32 NVIDIA V100 16GB GPUs."
Software Dependencies | No | The paper mentions using PyTorch, Hugging Face Transformers, scikit-learn, and CVXPY for the implementation and baselines, but does not provide version numbers for these dependencies (a version-recording snippet follows this table).
Experiment Setup | Yes | "Unless specified otherwise, we use 12 layers, 8 heads, and a hidden size (d_h) of 256 in the architecture for all of our experiments. We use a batch size of 64 and train the model for 500k steps. We use Adam optimizer... We train all of our models with curriculum..." (A configuration sketch follows this table.)
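The Open Datasets and Dataset Splits rows both hinge on the same point: prompts are sampled on the fly rather than read from a fixed corpus. A minimal sketch of such a sampler is below, assuming a linear function family f(x) = <w, x> as one illustrative choice (the paper studies several families, including Fourier series); the function name and dimensions are placeholders, not the authors' code.

```python
import torch

def sample_icl_prompt(d=8, n_points=16):
    """Sample one in-context-learning prompt on the fly.

    Inputs x_i are drawn i.i.d. from D_X = N(0, 1) per coordinate, as the
    paper states; the linear family f(x) = <w, x> is an illustrative
    assumption, one of many possible function families.
    """
    xs = torch.randn(n_points, d)  # x_i in R^d, i.i.d. standard normal
    w = torch.randn(d)             # one freshly sampled function per prompt
    ys = xs @ w                    # y_i = f(x_i), with f : R^d -> R
    return xs, ys

# Fresh prompts are drawn at every training and evaluation step,
# which is why no fixed train/validation/test splits exist.
xs, ys = sample_icl_prompt()
```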
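Because the paper names its libraries but not their versions, a reproducer must pin them independently. A snippet like the following records whatever versions are actually installed; the package names come from the paper, and nothing else is assumed.

```python
import torch, transformers, sklearn, cvxpy

# Record the installed library versions, since the paper does not pin them.
for mod in (torch, transformers, sklearn, cvxpy):
    print(f"{mod.__name__}=={mod.__version__}")
```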
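The quoted experiment setup maps naturally onto a GPT-2-style decoder configuration in Hugging Face Transformers, which the paper says it uses. The sketch below wires the stated hyperparameters (12 layers, 8 heads, d_h = 256, Adam, batch size 64, 500k steps) into such a config; the learning rate and the curriculum schedule are not quoted above, so they appear only as labeled placeholders.

```python
import torch
from transformers import GPT2Config, GPT2Model

# Architecture from the quoted setup: 12 layers, 8 heads, hidden size 256.
config = GPT2Config(n_layer=12, n_head=8, n_embd=256)
model = GPT2Model(config)

# Adam optimizer as stated; the learning rate is a placeholder (not quoted).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

BATCH_SIZE = 64        # quoted batch size
TRAIN_STEPS = 500_000  # quoted number of training steps

# The paper also trains with a curriculum; its schedule is not quoted
# above, so it is omitted from this sketch.
```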