A Meta-Analysis of Overfitting in Machine Learning
Authors: Rebecca Roelofs, Vaishaal Shankar, Benjamin Recht, Sara Fridovich-Keil, Moritz Hardt, John Miller, Ludwig Schmidt
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct the first large meta-analysis of overfitting due to test set reuse in the machine learning community. Our analysis is based on over one hundred machine learning competitions hosted on the Kaggle platform over the course of several years. In this paper, we empirically study holdout reuse at a significantly larger scale by analyzing data from 120 machine learning competitions on the popular Kaggle platform [2]. |
| Researcher Affiliation | Academia | Rebecca Roelofs, UC Berkeley, roelofs@berkeley.edu; Sara Fridovich-Keil, UC Berkeley, sfk@berkeley.edu; John Miller, UC Berkeley, miller_john@berkeley.edu; Vaishaal Shankar, UC Berkeley, vaishaal@berkeley.edu; Moritz Hardt, UC Berkeley, hardt@berkeley.edu; Benjamin Recht, UC Berkeley, brecht@berkeley.edu; Ludwig Schmidt, UC Berkeley, ludwig@berkeley.edu |
| Pseudocode | No | The paper does not contain any sections explicitly labeled 'Pseudocode' or 'Algorithm', nor are there any structured code-like blocks. |
| Open Source Code | No | The paper refers to existing platforms and datasets (e.g., 'Kaggle is the most widely used platform for machine learning competitions', 'Kaggle has released the Meta Kaggle dataset'), but there is no explicit statement that the authors are releasing their own source code for the methodology described in the paper. |
| Open Datasets | Yes | Kaggle has released the Meta Kaggle dataset (https://www.kaggle.com/kaggle/meta-kaggle), which contains detailed information about competitions, submissions, etc. on the Kaggle platform. The structure of Kaggle competitions makes Meta Kaggle a useful dataset for investigating overfitting empirically at a large scale. |
| Dataset Splits | Yes | Considering the danger of overfitting to the test set in a competitive environment, Kaggle subdivides each test set into public and private components. Table 1: The four accuracy competitions with the largest number of submissions. n_public is the size of the public test set and n_private is the size of the private test set. |
| Hardware Specification | No | The paper describes a meta-analysis of Kaggle competition data and does not specify any particular hardware used for its own computational analysis. |
| Software Dependencies | No | The paper mentions common machine learning libraries like XGBoost [4] or scikit-learn [14] as potentially used by Kaggle competitors, but it does not provide specific software dependencies or version numbers for the authors' own analysis or experimental setup. |
| Experiment Setup | No | The paper discusses the 'experimental setup' of Kaggle competitions (Section 2.2) and the methodologies used for its analysis, but it does not provide specific experimental setup details such as hyperparameters, model initialization, or training schedules for its own meta-analysis. |
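The public/private split cited in the Dataset Splits row is the mechanism Kaggle uses to discourage leaderboard overfitting: a submission is scored on both halves, but only the public score is visible during the competition. A minimal sketch of that scoring scheme (the function names and the 50/50 split fraction are illustrative assumptions, not taken from the paper or from Kaggle's implementation):

```python
import random


def split_test_set(n_test, public_frac=0.5, seed=0):
    """Randomly partition test-set indices into public and private subsets,
    mirroring how Kaggle hides part of the test set until a competition ends."""
    rng = random.Random(seed)
    indices = list(range(n_test))
    rng.shuffle(indices)
    cut = int(n_test * public_frac)
    return set(indices[:cut]), set(indices[cut:])


def accuracy(preds, labels, subset):
    """Accuracy of a submission's predictions, restricted to one index subset."""
    correct = sum(preds[i] == labels[i] for i in subset)
    return correct / len(subset)


# Toy submission: 90 of 100 predictions match the labels.
labels = [i % 2 for i in range(100)]
preds = [i % 2 if i % 10 else 1 - i % 2 for i in range(100)]

public_idx, private_idx = split_test_set(len(labels))
public_score = accuracy(preds, labels, public_idx)    # shown on the leaderboard
private_score = accuracy(preds, labels, private_idx)  # revealed only at the end
```

Comparing the two scores across many submissions is, in essence, the overfitting signal the paper's meta-analysis extracts from the Meta Kaggle data: if competitors adapt to the public leaderboard, public scores should systematically exceed private ones.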