Fast Rates for Exp-concave Empirical Risk Minimization
Authors: Tomer Koren, Kfir Y. Levy
NeurIPS 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We establish the first evidence that ERM is able to attain fast generalization rates, and show that the expected loss of the ERM solution in d dimensions converges to the optimal expected loss at a rate of d/n. Our convergence analysis relies on stability arguments introduced by Bousquet and Elisseeff [4]. We prove that the expected loss of the regularized ERM solution does not change significantly when a single instance, picked uniformly at random from the training sample, is discarded. Then, the technique of Bousquet and Elisseeff [4] allows us to translate this average stability property into a generalization guarantee. Our proof of Theorem 1 proceeds as follows. First, we relate the expected excess risk of the ERM estimator ŵ to its average leave-one-out stability [4]. Then, we bound this stability in terms of certain local properties of the empirical risk at the point ŵ. The remainder of the section is devoted to the proof of Theorem 3. (A sketch of this stability-to-generalization step appears after the table.) |
| Researcher Affiliation | Academia | Tomer Koren, Technion, Haifa 32000, Israel, tomerk@technion.ac.il; Kfir Y. Levy, Technion, Haifa 32000, Israel, kfiryl@tx.technion.ac.il |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | The paper is theoretical and does not describe the use of specific datasets in experiments. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments or dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe the use of specific hardware for experiments. |
| Software Dependencies | No | The paper is theoretical and does not describe software dependencies with version numbers for experimental replication. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or training configurations. |
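The stability argument quoted in the Research Type row is the standard Bousquet–Elisseeff route from average stability to a generalization guarantee. The block below is a minimal sketch of that route; the notation (loss f(w; z), ERM output ŵ_S, resampled point z'_i) and the replace-one-sample form of the identity are assumptions made here for illustration, not taken verbatim from the paper.

```latex
% Minimal, self-contained sketch of the stability-to-generalization identity
% underlying the quoted argument (Bousquet & Elisseeff style).
% Notation (f, S, S^{(i)}, \hat{w}_S, z'_i) is assumed here, not copied from the paper.
\documentclass{article}
\usepackage{amsmath}
\begin{document}

Let $S = (z_1, \dots, z_n)$ be an i.i.d.\ sample, $f(w; z)$ the loss,
$F(w) = \mathbb{E}_z[f(w; z)]$ the expected risk, and
$\hat{F}_S(w) = \tfrac{1}{n}\sum_{i=1}^n f(w; z_i)$ the empirical risk.
Write $\hat{w}_S$ for the (regularized) ERM solution trained on $S$, and let
$S^{(i)}$ denote $S$ with $z_i$ replaced by an independent copy $z'_i$.
For an index $i$ drawn uniformly from $\{1, \dots, n\}$,
\begin{equation}
\mathbb{E}_S\!\left[ F(\hat{w}_S) - \hat{F}_S(\hat{w}_S) \right]
  = \mathbb{E}_{S,\, z'_i,\, i}\!\left[ f(\hat{w}_{S^{(i)}}; z_i) - f(\hat{w}_S; z_i) \right],
\end{equation}
so bounding the average stability term on the right-hand side by $O(d/n)$
(as the paper's analysis does for exp-concave losses) yields an $O(d/n)$
bound on the expected generalization gap of the ERM solution.

\end{document}
```

This is the replace-one-sample form of the identity; the quoted response describes the closely related leave-one-out form, in which a uniformly random instance is discarded rather than replaced.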