A Meta-Analysis of Overfitting in Machine Learning
Authors: Rebecca Roelofs, Vaishaal Shankar, Benjamin Recht, Sara Fridovich-Keil, Moritz Hardt, John Miller, Ludwig Schmidt
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct the first large meta-analysis of overfitting due to test set reuse in the machine learning community. Our analysis is based on over one hundred machine learning competitions hosted on the Kaggle platform over the course of several years. In this paper, we empirically study holdout reuse at a significantly larger scale by analyzing data from 120 machine learning competitions on the popular Kaggle platform [2]. |
| Researcher Affiliation | Academia | Rebecca Roelofs, UC Berkeley, roelofs@berkeley.edu; Sara Fridovich-Keil, UC Berkeley, sfk@berkeley.edu; John Miller, UC Berkeley, miller_john@berkeley.edu; Vaishaal Shankar, UC Berkeley, vaishaal@berkeley.edu; Moritz Hardt, UC Berkeley, hardt@berkeley.edu; Benjamin Recht, UC Berkeley, brecht@berkeley.edu; Ludwig Schmidt, UC Berkeley, ludwig@berkeley.edu |
| Pseudocode | No | The paper does not contain any sections explicitly labeled 'Pseudocode' or 'Algorithm', nor are there any structured code-like blocks. |
| Open Source Code | No | The paper refers to existing platforms and datasets (e.g., 'Kaggle is the most widely used platform for machine learning competitions', 'Kaggle has released the Meta Kaggle dataset'), but there is no explicit statement that the authors are releasing their own source code for the methodology described in the paper. |
| Open Datasets | Yes | Kaggle has released the Meta Kaggle dataset (https://www.kaggle.com/kaggle/meta-kaggle), which contains detailed information about competitions, submissions, etc. on the Kaggle platform. The structure of Kaggle competitions makes Meta Kaggle a useful dataset for investigating overfitting empirically at a large scale. |
| Dataset Splits | Yes | Considering the danger of overfitting to the test set in a competitive environment, Kaggle subdivides each test set into public and private components. Table 1: The four accuracy competitions with the largest number of submissions. n_public is the size of the public test set and n_private is the size of the private test set. |
| Hardware Specification | No | The paper describes a meta-analysis of Kaggle competition data and does not specify any particular hardware used for its own computational analysis. |
| Software Dependencies | No | The paper mentions common machine learning libraries like XGBoost [4] or scikit-learn [14] as potentially used by Kaggle competitors, but it does not provide specific software dependencies or version numbers for the authors' own analysis or experimental setup. |
| Experiment Setup | No | The paper discusses the 'experimental setup' of Kaggle competitions (Section 2.2) and the methodologies used for its analysis, but it does not provide specific experimental setup details such as hyperparameters, model initialization, or training schedules for its own meta-analysis. |
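The public/private split cited in the Dataset Splits row is the mechanism Kaggle uses to discourage leaderboard overfitting: a submission is scored on both halves, but only the public score is visible during the competition. A minimal sketch of that scoring scheme (the function names and the 50/50 split fraction are illustrative assumptions, not taken from the paper or from Kaggle's implementation):

```python
import random


def split_test_set(n_test, public_frac=0.5, seed=0):
    """Randomly partition test-set indices into public and private subsets,
    mirroring how Kaggle hides part of the test set until a competition ends."""
    rng = random.Random(seed)
    indices = list(range(n_test))
    rng.shuffle(indices)
    cut = int(n_test * public_frac)
    return set(indices[:cut]), set(indices[cut:])


def accuracy(preds, labels, subset):
    """Accuracy of a submission's predictions, restricted to one index subset."""
    correct = sum(preds[i] == labels[i] for i in subset)
    return correct / len(subset)


# Toy submission: 90 of 100 predictions match the labels.
labels = [i % 2 for i in range(100)]
preds = [i % 2 if i % 10 else 1 - i % 2 for i in range(100)]

public_idx, private_idx = split_test_set(len(labels))
public_score = accuracy(preds, labels, public_idx)    # shown on the leaderboard
private_score = accuracy(preds, labels, private_idx)  # revealed only at the end
```

Comparing the two scores across many submissions is, in essence, the overfitting signal the paper's meta-analysis extracts from the Meta Kaggle data: if competitors adapt to the public leaderboard, public scores should systematically exceed private ones.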