New Bounds for Hyperparameter Tuning of Regression Problems Across Instances
Authors: Maria-Florina F. Balcan, Anh Nguyen, Dravyansh Sharma
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This paper investigates the sample complexity of tuning regularization parameters in linear and logistic regression under ℓ1- and ℓ2-constraints in the data-driven setting. For the linear regression problem, by more carefully exploiting the structure of the dual function class, we provide a new upper bound on the pseudo-dimension of the validation loss function class, which significantly improves the best-known results on the problem. Remarkably, we also establish the first matching lower bound, proving our results are tight. For tuning the regularization parameters of logistic regression, we introduce a new approach to studying the learning guarantee via an approximation of the validation loss function class. We examine the pseudo-dimension of the approximation class and construct a uniform error bound between the validation loss function class and its approximation, which yields the first learning guarantee for the problem of tuning logistic regression regularization coefficients. (A standard formalization of the validation loss function class and its pseudo-dimension is sketched after the table.) |
| Researcher Affiliation | Academia | Maria-Florina Balcan (Carnegie Mellon University, ninamf@cs.cmu.edu); Anh Tuan Nguyen (Carnegie Mellon University, atnguyen@cs.cmu.edu); Dravyansh Sharma (Carnegie Mellon University, dravyans@cs.cmu.edu) |
| Pseudocode | Yes | Algorithm 1: Approximate incremental quadratic algorithm for RLR with ℓ1 penalty [2]; Algorithm 2: Approximate incremental quadratic algorithm for RLR with ℓ2 penalty [2]. (A toy illustration of the tuning objective these procedures target, though not of the algorithms themselves, follows the table.) |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository. |
| Open Datasets | No | The paper describes abstract "regression dataset (X, y)" and "problem instances" for theoretical analysis, but does not mention the use of any specific publicly available datasets by name, nor does it provide links, DOIs, or formal citations for data access. |
| Dataset Splits | No | The paper describes general data structures like "training dataset with m samples and p features" and a "validation split with m samples" but does not specify concrete percentages or sample counts for a particular dataset, as it is a theoretical paper. |
| Hardware Specification | No | The paper does not provide any details about the specific hardware used to run experiments or computations. |
| Software Dependencies | No | The paper mentions names of statistical models and algorithms (e.g., Elastic Net, LASSO, Logistic Regression) but does not list any specific software packages, libraries, or solvers with version numbers. |
| Experiment Setup | No | The paper focuses on theoretical analysis and bounds; it does not provide concrete experimental setup details such as hyperparameter values, optimizer settings, or training schedules. |
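
To make the objects named in the Research Type row concrete, the following is a minimal formalization in the usual data-driven algorithm design notation. This is an editorial sketch, not the paper's exact definitions; the symbols $\hat{\beta}_{\lambda}$, $h_{\lambda}$, and $\mathcal{H}$ are labels chosen here.

```latex
% Elastic-net-style regularized estimator for a training set (X_tr, y_tr):
\[
  \hat{\beta}_{\lambda}(X_{\mathrm{tr}}, y_{\mathrm{tr}})
  \;=\; \arg\min_{\beta \in \mathbb{R}^{p}}
  \;\|X_{\mathrm{tr}}\beta - y_{\mathrm{tr}}\|_2^2
  \;+\; \lambda_1 \|\beta\|_1 \;+\; \lambda_2 \|\beta\|_2^2 .
\]
% Validation loss of that estimator on the held-out split of instance P:
\[
  h_{\lambda}(P)
  \;=\; \big\| X_{\mathrm{val}}\,\hat{\beta}_{\lambda}(X_{\mathrm{tr}}, y_{\mathrm{tr}})
  - y_{\mathrm{val}} \big\|_2^2 ,
  \qquad
  \mathcal{H} \;=\; \{\, h_{\lambda} : \lambda \in \Lambda \,\}.
\]
% The paper's sample complexity results bound the pseudo-dimension of H:
% roughly, how many i.i.d. problem instances are needed so that the
% empirical average of h_lambda uniformly approximates its expectation,
% simultaneously for all lambda in the parameter range Lambda.
```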
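The Pseudocode row references Algorithms 1 and 2 (approximate incremental quadratic algorithms for RLR), whose mechanics are not reproduced here. As a hedged illustration of the objective such tuning procedures optimize, the sketch below grid-searches a shared ℓ1 coefficient against the average validation loss over sampled synthetic instances. Everything in it (the instance generator, the grid, and the helper names `sample_instance` and `val_loss`) is hypothetical scaffolding, not the paper's method.

```python
# Toy illustration of the tuning objective: pick lambda minimizing average
# validation loss over i.i.d. problem instances. This is NOT the paper's
# Algorithm 1/2; the generator and helper names below are hypothetical.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)

def sample_instance(m=50, p=20, noise=0.1):
    """One instance P = ((X_tr, y_tr), (X_val, y_val)) with a sparse signal."""
    w = rng.normal(size=p) * (rng.random(p) < 0.3)   # sparse ground truth
    X_tr, X_val = rng.normal(size=(m, p)), rng.normal(size=(m, p))
    y_tr = X_tr @ w + noise * rng.normal(size=m)
    y_val = X_val @ w + noise * rng.normal(size=m)
    return X_tr, y_tr, X_val, y_val

def val_loss(lam, inst):
    """h_lambda(P): validation MSE of the lasso fit with penalty lam."""
    X_tr, y_tr, X_val, y_val = inst
    model = Lasso(alpha=lam, max_iter=10_000).fit(X_tr, y_tr)
    return float(np.mean((model.predict(X_val) - y_val) ** 2))

instances = [sample_instance() for _ in range(30)]   # sampled instances
grid = np.logspace(-4, 1, 50)                        # candidate lambdas
avg = [np.mean([val_loss(lam, P) for P in instances]) for lam in grid]
print(f"best lambda on the sample: {grid[int(np.argmin(avg))]:.4g}")
```

A grid is used here only for transparency; the paper's contribution is to bound how many sampled instances such an empirical minimization needs to generalize to the instance distribution, not to prescribe grid search.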