Information theoretic limits of learning a sparse rule
Authors: Clément Luneau, Jean Barbier, Nicolas Macris
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We prove a variational formula for the asymptotic mutual information per sample when the system size grows to infinity. This result allows us to derive an expression for the minimum mean-square error (MMSE) of the Bayesian estimator when the signal entries have a discrete distribution with finite support. We find that, for such signals and suitable vanishing scalings of the sparsity and sampling rate, the MMSE is a nonincreasing, piecewise-constant function. Our analysis goes beyond the linear case and applies to learning the weights of a perceptron with a general activation function in a teacher-student scenario. Our numerical experiments in Section 3 are for deterministic activations, but all of our theoretical results hold in the broader setting. (An illustrative data-generation sketch for this model follows the table below.) |
| Researcher Affiliation | Academia | Clément Luneau and Nicolas Macris, École Polytechnique Fédérale de Lausanne, Switzerland; Jean Barbier, International Center for Theoretical Physics, Trieste, Italy |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper does not describe the use of publicly available datasets for training experiments. It works with theoretical models and distributions. |
| Dataset Splits | No | The paper does not describe any training, validation, or test dataset splits as it focuses on theoretical derivations rather than empirical experiments on data. |
| Hardware Specification | No | The paper does not provide any specific hardware details used for running its theoretical computations or generating the plots. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers needed to replicate the theoretical derivations or plots. |
| Experiment Setup | No | The paper discusses theoretical parameters of the generalized linear model and their asymptotic behavior. It does not provide specific experimental setup details such as hyperparameters, optimization settings, or training configurations typically associated with empirical machine learning experiments. |
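
For context on the "Research Type" row, the paper studies a teacher-student generalized linear model with a sparse signal. The snippet below is a minimal sketch of how data could be generated in such a setup; the Bernoulli prior, the sign activation, the sparsity level, the sampling rate, and the noise level are illustrative assumptions, not the paper's exact specification.

```python
# Minimal sketch (illustrative assumptions: Bernoulli prior, sign activation,
# Gaussian noise) of a teacher-student generalized linear model with a sparse signal.
import numpy as np

rng = np.random.default_rng(0)

n = 1000            # signal dimension
rho = 0.05          # sparsity level (fraction of nonzero entries) -- illustrative
alpha = 0.5         # sampling rate m/n -- illustrative
m = int(alpha * n)  # number of samples

# Teacher: sparse signal with a discrete, finite-support prior (here Bernoulli(rho) on {0, 1}).
w_star = (rng.random(n) < rho).astype(float)

# Random Gaussian design matrix and pre-activations scaled by 1/sqrt(n).
X = rng.standard_normal((m, n))
z = X @ w_star / np.sqrt(n)

# A deterministic activation (here the sign function) plus optional Gaussian noise.
def phi(z):
    return np.sign(z)

noise_std = 0.1     # illustrative noise level
y = phi(z) + noise_std * rng.standard_normal(m)

# The student observes (X, y) and the generative model. The paper's theoretical
# results characterize the asymptotic mutual information per sample and the MMSE
# of the Bayesian estimator of w_star as n grows to infinity.
print(X.shape, y.shape)
```

This sketch only illustrates the generative setup; the paper itself contains no code and its contributions are the asymptotic formulas for the mutual information and the MMSE, not a simulation pipeline.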