Risk Monotonicity in Statistical Learning

Authors: Zakaria Mhammedi

NeurIPS 2021

Reproducibility Assessment (Variable: Result; LLM Response)

- Research Type: Theoretical. "In this paper, we derive the first consistent and risk-monotonic (in high probability) algorithms for a general statistical learning setting under weak assumptions, consequently answering some questions posed by [53] on how to avoid non-monotonic behavior of risk curves. We further show that risk monotonicity need not necessarily come at the price of worse excess risk rates. To achieve this, we derive new empirical Bernstein-like concentration inequalities of independent interest that hold for certain non-i.i.d. processes such as Martingale Difference Sequences. In Section 3, we present our new concentration inequalities for Martingale Difference Sequences and loss processes we are interested in (those that satisfy Assumption 1 below). In Section 4, we present our risk-monotonic algorithm wrapper and show that it achieves risk monotonicity in high probability."
- Researcher Affiliation: Academia. Zakaria Mhammedi, Massachusetts Institute of Technology, mhammedi@mit.edu. Work done while at the Australian National University.
- Pseudocode: Yes. Algorithm 1: "A Risk Monotonic Algorithm Wrapper".
- Open Source Code: No. The paper does not provide any explicit statement about releasing code or a link to a code repository for the described methodology.
- Open Datasets: No. The paper is theoretical and does not conduct experiments on a public dataset using its proposed methods. Figure 1 illustrates a concept using a synthetic 1D linear regression problem with two specific instances, which are not described as a publicly available dataset.
- Dataset Splits: No. The paper is theoretical and does not describe empirical experiments or dataset usage beyond illustrative examples, so no training/validation/test splits are given.
- Hardware Specification: No. The paper reports no empirical experiments, so no hardware is specified.
- Software Dependencies: No. The paper focuses on mathematical derivations and algorithm design, not empirical experimentation, and lists no software dependencies or version numbers.
- Experiment Setup: No. With no empirical experiments, the paper provides no setup details, hyperparameters, or training settings.
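The paper's central technical tool is an empirical Bernstein-like concentration inequality, whose variance-dependent width is what makes tight risk comparisons possible. As a hedged illustration of the general form only, the sketch below computes the classical i.i.d. empirical Bernstein upper confidence bound (Maurer–Pontil style); the paper's actual inequalities extend to non-i.i.d. processes such as Martingale Difference Sequences, which this sketch does not cover. The function name and constants here are illustrative, not taken from the paper.

```python
import math

def empirical_bernstein_bound(xs, delta, b=1.0):
    """Upper confidence bound on the mean of i.i.d. samples in [0, b].

    Classical empirical Bernstein form (illustrative, not the paper's
    martingale version): with probability at least 1 - delta,
        E[X] <= mean + sqrt(2 * V * ln(2/delta) / n) + 7 b ln(2/delta) / (3(n-1)),
    where V is the sample variance. Low-variance data yields a tight bound.
    """
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)  # unbiased sample variance
    log_term = math.log(2.0 / delta)
    variance_term = math.sqrt(2.0 * var * log_term / n)
    range_term = 7.0 * b * log_term / (3.0 * (n - 1))
    return mean + variance_term + range_term
```

The appeal over a Hoeffding-type bound is that the dominant term scales with the observed variance rather than the worst-case range, so constant (zero-variance) samples give a much tighter bound than highly dispersed ones.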
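The paper's Algorithm 1 is a wrapper that makes a base learner's risk curve monotone with high probability. As a simplified sketch of the wrapper idea only (my own assumptions, not the paper's exact update rule): retain the incumbent hypothesis and switch to a newly trained one only when its empirical risk beats the incumbent's by more than both confidence widths, so the held risk can never provably increase.

```python
def monotone_wrapper(candidate_risks, widths):
    """Simplified risk-monotonic selection (illustrative, not the paper's Algorithm 1).

    candidate_risks[t]: empirical risk of the hypothesis trained on the first t batches.
    widths[t]: confidence width at step t (e.g. from an empirical Bernstein bound).
    Returns the index of the hypothesis held at each step: switch to the new
    hypothesis only when it improves on the incumbent beyond both widths.
    """
    held = [0]
    for t in range(1, len(candidate_risks)):
        cur = held[-1]
        if candidate_risks[t] + widths[t] < candidate_risks[cur] - widths[cur]:
            held.append(t)  # new hypothesis is provably better: switch
        else:
            held.append(cur)  # keep incumbent: held risk cannot go up
    return held
```

For example, with candidate risks [0.5, 0.6, 0.2, 0.25] and width 0.05 throughout, the wrapper keeps hypothesis 0, skips the worse hypothesis 1, switches to hypothesis 2, and then declines to switch to hypothesis 3, so the sequence of held risks is non-increasing even though the raw risk curve is not.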