Information theoretic limits of learning a sparse rule

Authors: Clément Luneau, Jean Barbier, Nicolas Macris

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We prove a variational formula for the asymptotic mutual information per sample when the system size grows to infinity. This result allows us to derive an expression for the minimum mean-square error (MMSE) of the Bayesian estimator when the signal entries have a discrete distribution with finite support. We find that, for such signals and suitable vanishing scalings of the sparsity and sampling rate, the MMSE is nonincreasing and piecewise constant. Our analysis goes beyond the linear case and applies to learning the weights of a perceptron with general activation function in a teacher-student scenario. Our numerical experiments in Section 3 are for deterministic activations, but all of our theoretical results hold for the broader setting. (A hedged sketch of the teacher-student model appears after this table.)
Researcher Affiliation | Academia | Clément Luneau, Nicolas Macris (École Polytechnique Fédérale de Lausanne, Switzerland); Jean Barbier (International Center for Theoretical Physics, Trieste, Italy)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the methodology described.
Open Datasets | No | The paper does not describe the use of publicly available datasets for training experiments; it works with theoretical models and distributions.
Dataset Splits | No | The paper does not describe any training, validation, or test dataset splits, as it focuses on theoretical derivations rather than empirical experiments on data.
Hardware Specification | No | The paper does not provide any specific hardware details used for running its numerical computations or generating its plots.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers needed to replicate the theoretical derivations or plots.
Experiment Setup | No | The paper discusses theoretical parameters of the generalized linear model and their asymptotic behavior; it does not provide specific experimental setup details such as hyperparameters, optimization settings, or training configurations typically associated with empirical machine learning experiments.
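
For readers unfamiliar with the setting referenced in the Research Type row, the following is a minimal LaTeX sketch of a teacher-student generalized linear model of the kind the paper analyses. The notation (n, m, X, W*, the activation phi, the normalizations of the mutual information and MMSE, and the vanishing-sparsity scaling) is an illustrative assumption made here and is not taken verbatim from the paper.

\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}
% Hedged sketch: a teacher-student generalized linear model (GLM).
% The symbols and normalizations below are assumptions for illustration,
% not the paper's exact definitions.
A teacher draws a hidden sparse weight vector $W^{*} \in \mathbb{R}^{n}$ and produces
$m$ labels through a (possibly random) activation $\varphi$:
\begin{equation}
  Y_{\mu} \;=\; \varphi\!\left(\frac{1}{\sqrt{n}} \sum_{i=1}^{n} X_{\mu i}\, W_{i}^{*}\right),
  \qquad \mu = 1, \dots, m .
\end{equation}
The student observes the inputs $X$ and the labels $Y$ and must learn $W^{*}$.
The two quantities referenced above are, up to the paper's exact normalization,
\begin{equation}
  i_{m} \;=\; \frac{1}{m}\, I\bigl(W^{*}; Y \,\big|\, X\bigr),
  \qquad
  \mathrm{MMSE} \;=\; \mathbb{E}\,\bigl\| W^{*} - \mathbb{E}[\, W^{*} \mid Y, X \,] \bigr\|^{2} ,
\end{equation}
studied in the limit $n \to \infty$ with a vanishing sparsity level and a
suitably scaled sampling rate $m/n$.
\end{document}

Per the abstract quoted in the table, it is in this kind of vanishing-sparsity regime that the paper's variational formula for the mutual information yields the nonincreasing, piecewise-constant behaviour of the MMSE.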