New Bounds for Hyperparameter Tuning of Regression Problems Across Instances

Authors: Maria-Florina Balcan, Anh Tuan Nguyen, Dravyansh Sharma

NeurIPS 2023

Reproducibility Variable | Result | LLM Response

Research Type | Theoretical
This paper investigates the sample complexity of tuning regularization parameters in linear and logistic regressions under ℓ1 and ℓ2-constraints in the data-driven setting. For the linear regression problem, by more carefully exploiting the structure of the dual function class, we provide a new upper bound for the pseudo-dimension of the validation loss function class, which significantly improves the best-known results on the problem. Remarkably, we also instantiate the first matching lower bound, proving our results are tight. For tuning the regularization parameters of logistic regression, we introduce a new approach to studying the learning guarantee via an approximation of the validation loss function class. We examine the pseudo-dimension of the approximation class and construct a uniform error bound between the validation loss function class and its approximation, which allows us to instantiate the first learning guarantee for the problem of tuning logistic regression regularization coefficients.

Researcher Affiliation | Academia
Maria-Florina Balcan, Carnegie Mellon University, ninamf@cs.cmu.edu
Anh Tuan Nguyen, Carnegie Mellon University, atnguyen@cs.cmu.edu
Dravyansh Sharma, Carnegie Mellon University, dravyans@cs.cmu.edu

Pseudocode | Yes
Algorithm 1: Approximate incremental quadratic algorithm for RLR with ℓ1 penalty [2]
Algorithm 2: Approximate incremental quadratic algorithm for RLR with ℓ2 penalty [2]

Open Source Code | No
The paper does not provide any explicit statement about releasing source code or a link to a code repository.

Open Datasets | No
The paper describes an abstract "regression dataset (X, y)" and "problem instances" for theoretical analysis, but does not mention any specific publicly available dataset by name, nor does it provide links, DOIs, or formal citations for data access.

Dataset Splits | No
The paper describes general data structures such as a "training dataset with m samples and p features" and a "validation split with m samples" but, being theoretical, does not specify concrete percentages or sample counts for any particular dataset.

Hardware Specification | No
The paper does not provide any details about the specific hardware used to run experiments or computations.

Software Dependencies | No
The paper mentions the names of statistical models and algorithms (e.g., Elastic Net, LASSO, logistic regression) but does not list any specific software packages, libraries, or solvers with version numbers.

Experiment Setup | No
The paper focuses on theoretical analysis and bounds; it does not provide concrete experimental setup details such as hyperparameter values, optimizer settings, or training schedules.
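To make the object of study concrete: the paper analyzes the pseudo-dimension of the class of functions mapping a regularization coefficient to the validation loss of the resulting regressor, across problem instances. Below is a minimal numpy sketch of that mapping for ridge (ℓ2-penalized) linear regression on one toy instance; the synthetic data, the λ grid, and the closed-form solver are illustrative assumptions, not the paper's algorithms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "problem instance": a training set and a validation set drawn
# from the same linear model with Gaussian noise.
m, p = 40, 5
w_true = rng.normal(size=p)
X_tr = rng.normal(size=(m, p))
y_tr = X_tr @ w_true + 0.5 * rng.normal(size=m)
X_val = rng.normal(size=(m, p))
y_val = X_val @ w_true + 0.5 * rng.normal(size=m)

def ridge_fit(X, y, lam):
    """Closed-form ridge estimator: (X^T X + lam * I)^{-1} X^T y."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

def val_loss(lam):
    """Validation mean squared error as a function of the
    regularization coefficient lam; this is the per-instance
    function whose class the paper's bounds concern."""
    w = ridge_fit(X_tr, y_tr, lam)
    return float(np.mean((X_val @ w - y_val) ** 2))

# Data-driven tuning on this instance: pick lam minimizing validation loss
# over a grid (the paper studies guarantees for tuning across instances).
lams = np.logspace(-3, 2, 50)
losses = [val_loss(lam) for lam in lams]
best_lam = lams[int(np.argmin(losses))]
```

Repeating this over many sampled instances and averaging the loss curves is the "across instances" setting; the paper's contribution is bounding how many instances suffice for the grid-free empirical minimizer to generalize.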