Information bottleneck theory of high-dimensional regression: relevancy, efficiency and optimality

Authors: Vudtiwat Ngampruetikorn, David J. Schwab

NeurIPS 2022

Reproducibility

Research Type: Theoretical — "Here we quantify overfitting via residual information, defined as the bits in fitted models that encode noise in training data. Information efficient learning algorithms minimize residual information while maximizing the relevant bits, which are predictive of the unknown generative models. We solve this optimization to obtain the information content of optimal algorithms for a linear regression problem and compare it to that of randomized ridge regression. Our results demonstrate the fundamental trade-off between residual and relevant information and characterize the relative information efficiency of randomized regression with respect to optimal algorithms. Finally, using results from random matrix theory, we reveal the information complexity of learning a linear map in high dimensions and unveil information-theoretic analogs of double and multiple descent phenomena."

Researcher Affiliation: Academia — Vudtiwat Ngampruetikorn* and David J. Schwab, Initiative for the Theoretical Sciences, The Graduate Center, CUNY (*vngampruetikorn@gc.cuny.edu).

Pseudocode: No — The paper contains no pseudocode or algorithm blocks.

Open Source Code: No — The paper answers 'N/A' to the checklist item 'Did you include the code, data, and instructions needed to reproduce the main experimental results?'.

Open Datasets: No — The paper defines a generative model and "training data of N iid samples, S = {(x1, y1), . . . , (xN, yN)}" as part of its theoretical setup, but it does not use or provide access to a specific publicly available dataset.

Dataset Splits: No — The paper is theoretical and describes no empirical experiments with dataset splits for validation.

Hardware Specification: No — The paper states 'N/A' for experimental details and mentions no specific hardware.

Software Dependencies: No — The paper states 'N/A' for experimental details and lists no software dependencies with version numbers.

Experiment Setup: No — The paper describes parameters of its mathematical models, but no experimental setup details such as hyperparameters or training configurations for empirical runs.