Minimizers of the Empirical Risk and Risk Monotonicity
Authors: Marco Loog, Tom Viering, Alexander Mey
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The paper introduces the formal notion of risk monotonicity, which asks that the risk not deteriorate with increasing training set size, in expectation over the training samples. It then presents the surprising result that various standard learners, specifically those that minimize the empirical risk, can act nonmonotonically irrespective of the training sample size, and provides a theoretical underpinning for specific instantiations from classification, regression, and density estimation. Altogether, the proposed monotonicity notion opens up a whole new direction of research. Section 5 provides experimental evidence for some cases of interest that have, up to now, resisted any deeper theoretical analysis (a formal sketch of the monotonicity condition is given below the table). |
| Researcher Affiliation | Academia | Marco Loog (Delft University of Technology & University of Copenhagen); Tom Viering (Delft University of Technology); Alexander Mey (Delft University of Technology) |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statements or links indicating that open-source code for the methodology described is provided. |
| Open Datasets | No | The paper describes experiments on custom-defined distributions (e.g., 'a = (1,1) and b = (-1/10,1)', 'supported on three points: a = (1,1), b = (-1/10, 1), and c = (-1,1)'). These are not named public datasets, and no information is provided about their public availability or access. |
| Dataset Splits | No | The paper does not specify exact training, validation, or test dataset splits in terms of percentages, sample counts, or citations to predefined splits. It discusses 'training samples' generally but not specific experimental data partitioning. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers (e.g., library or solver names with versions) needed to replicate the experiment. |
| Experiment Setup | No | The paper mentions 'a small amount of standard L2-regularization decreasing with training size (λ = 0.01/n)' for one specific experiment in Section 5. However, this is not a comprehensive description of the experimental setup: other concrete hyperparameter values and system-level training settings for the models presented are not given (a hedged code sketch of this setup appears below the table). |
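
The Research Type row quotes the paper's notion of risk monotonicity. As a minimal formal sketch, assuming the standard setup in which a learner A maps an i.i.d. training sample S_n of size n to a hypothesis whose true risk is R (this notation is ours, not quoted from the paper), the requirement that the risk does not deteriorate in expectation as the training set grows can be written as:

```latex
% Risk monotonicity at sample size n (sketch; the symbols A, S_n, and R are
% assumed notation): the expected risk after training on n+1 samples should
% not exceed the expected risk after training on n samples.
\mathbb{E}_{S_{n+1}}\!\left[ R\bigl( A(S_{n+1}) \bigr) \right]
\;\le\;
\mathbb{E}_{S_{n}}\!\left[ R\bigl( A(S_{n}) \bigr) \right]
```

The paper's claim, as summarized above, is that empirical risk minimizers can violate this inequality irrespective of the training sample size; the Section 5 experiments probe cases that have so far resisted deeper theoretical analysis.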
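
The Open Datasets and Experiment Setup rows mention a synthetic distribution supported on a handful of points and L2-regularization with λ = 0.01/n. The following is a minimal Monte-Carlo sketch of such a learning-curve experiment, assuming squared loss, a linear model without intercept, and equal sampling probabilities on two of the quoted support points; the learner, the probabilities, and the helper names (`fit_ridge_no_intercept`, `expected_risk`, `mean_risk_at_n`) are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Support points (x, y) quoted in the Open Datasets row; the equal sampling
# probabilities are an assumption, not the paper's exact weights.
support = np.array([[1.0, 1.0], [-0.1, 1.0]])
probs = np.array([0.5, 0.5])

LAMBDA_SCALE = 0.01  # lambda = 0.01 / n, as quoted in the Experiment Setup row


def fit_ridge_no_intercept(x, y, lam):
    """1-D ridge regression without intercept:
    argmin_w  mean((x * w - y)**2) + lam * w**2."""
    n = len(x)
    return np.sum(x * y) / (np.sum(x * x) + n * lam)


def expected_risk(w):
    """Population squared-loss risk of slope w under the assumed distribution."""
    x, y = support[:, 0], support[:, 1]
    return float(np.sum(probs * (x * w - y) ** 2))


def mean_risk_at_n(n, reps=5000):
    """Monte-Carlo estimate of E_{S_n}[R(A(S_n))] over resampled training sets."""
    risks = []
    for _ in range(reps):
        idx = rng.choice(len(support), size=n, p=probs)
        w = fit_ridge_no_intercept(support[idx, 0], support[idx, 1], LAMBDA_SCALE / n)
        risks.append(expected_risk(w))
    return float(np.mean(risks))


# Expected-risk learning curve; under the paper's constructions such curves
# need not decrease monotonically in n.
for n in range(1, 11):
    print(n, mean_risk_at_n(n))
```

Averaging the population risk over many resampled training sets approximates the expectation in the monotonicity condition sketched above; whether the resulting curve actually increases anywhere depends on the exact distribution and learner used in Section 5 of the paper.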