Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

On the Power of Foundation Models

Authors: Yang Yuan

ICML 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	Unlike most machine learning theory papers, our paper does not have any assumptions on the data distribution or network structure. Instead, we take the bird's-eye view that is model oblivious, and only focuses on the structure defined by the pretext task. It is indeed possible that by designing a special network, one may get a more powerful model with better performance. However, we stick with our setting because: Empirically, people do not customize network structures for different tasks. Instead, they tend to use similar structures like ResNet (He et al., 2016) or Transformer (Vaswani et al., 2017).
Researcher Affiliation	Collaboration	1 IIIS, Tsinghua University; 2 Shanghai Artificial Intelligence Laboratory; 3 Shanghai Qi Zhi Institute. Correspondence to: Yang Yuan <EMAIL>.
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper is theoretical and does not describe a new methodology for which source code would typically be provided or released. There are no statements or links regarding open-source code for the described theoretical framework.
Open Datasets	No	The paper is theoretical and does not perform experiments that involve training on specific public datasets. It refers to training in other research as context but does not conduct its own empirical training.
Dataset Splits	No	The paper is theoretical and does not perform experiments, thus it does not provide specific dataset split information for training, validation, or testing.
Hardware Specification	No	The paper is theoretical and does not describe any empirical experiments, thus it does not specify any hardware used for running experiments.
Software Dependencies	No	The paper is theoretical and does not conduct experiments that would require specific software dependencies with version numbers.
Experiment Setup	No	The paper is theoretical and does not describe any empirical experiments, and therefore does not include details on experimental setup such as hyperparameters or training configurations.