Unrolled denoising networks provably learn to perform optimal Bayesian inference

Authors: Aayush Karan, Kulin Shah, Sitan Chen, Yonina Eldar

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP). For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network approximately converge to the same denoisers used in Bayes AMP. We also provide extensive numerical experiments for compressed sensing and rank-one matrix estimation demonstrating the advantages of our unrolled architecture: in addition to being able to obliviously adapt to general priors, it exhibits improvements over Bayes AMP in more general settings of low dimensions, non-Gaussian designs, and non-product priors. [A standard statement of the AMP recursion and of the Bayes-AMP denoiser is sketched below the table.]
Researcher Affiliation | Academia | Aayush Karan (Harvard SEAS, akaran1@g.harvard.edu), Kulin Shah (UT Austin, kulinshah@utexas.edu), Sitan Chen (Harvard SEAS, sitan@seas.harvard.edu), Yonina C. Eldar (Weizmann Institute of Science, yonina.eldar@weizmann.ac.il)
Pseudocode | Yes | Algorithm 1: Layerwise Training. Algorithm 2: Learning B.
Open Source Code | No | We currently don't provide the code for open access, but we are planning to do it soon.
Open Datasets | No | We randomly generated a train and validation dataset {y_i, x_i}_{i=1}^N with N = 2^15 samples by sampling from the prior and using Eq. (1).
Dataset Splits | No | We randomly generated a train and validation dataset {y_i, x_i}_{i=1}^N with N = 2^15 samples by sampling from the prior and using Eq. (1). Specific proportions or counts for the validation split are not provided.
Hardware Specification | No | The estimated amount of training time for our final experiments is around 100 CPU hours. This indicates CPU usage but does not provide specific CPU models, memory details, or other hardware specifications.
Software Dependencies | No | The paper mentions 'GELU activations' but does not specify software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x).
Experiment Setup | Yes | Implementation details. We set m = 250, d = 500 and fix a random Gaussian sensing matrix A ∈ ℝ^{250×500}. We consider two choices of prior for our experiments: Bernoulli-Gaussian and ℤ_2 (i.e. uniform over {±1}^n). For our unrolled architecture, the family F of learned MLP denoisers was restricted to three hidden layers, each with 70 neurons and GELU activations. This particular architectural choice was the most convenient for our experiments, but our experimental findings are not particularly sensitive to this. We randomly generated a train and validation dataset {y_i, x_i}_{i=1}^N with N = 2^15 samples by sampling from the prior and using Eq. (1). We train layerwise with finetuning as in Algorithm 1. [Hedged sketches of this data generation and of the unrolled architecture with layerwise training appear below the table.]
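
As context for the abstract's claim that the learned layers converge to the Bayes-AMP denoisers, the block below states the AMP recursion for the model y = Ax + noise (A ∈ ℝ^{m×d}) together with the Bayes-AMP posterior-mean denoiser. This is reconstructed from the general AMP literature rather than quoted from the paper; the prior π and the noise scale σ_t (tracked by state evolution) are standard notation, not the paper's.

```latex
% Standard AMP recursion (not quoted from the paper); <.> denotes the entrywise average.
\begin{align*}
  x^{t+1} &= \eta_t\!\left(x^t + A^\top z^t\right), \\
  z^{t+1} &= y - A x^{t+1} + \frac{d}{m}\, z^t \left\langle \eta_t'\!\left(x^t + A^\top z^t\right) \right\rangle .
\end{align*}
% Bayes AMP instantiates eta_t as the posterior-mean denoiser for the product prior pi:
\[
  \eta_t(v) = \mathbb{E}\!\left[\, X \mid X + \sigma_t G = v \,\right],
  \qquad X \sim \pi, \quad G \sim \mathcal{N}(0, 1),
\]
% with sigma_t given by state evolution. The unrolled network instead learns eta_t from (y, x) data.
```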
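The data generation quoted in the Open Datasets and Experiment Setup rows could be reproduced along the following lines. This is a minimal sketch assuming a Bernoulli-Gaussian prior; the sparsity level, noise standard deviation, sensing-matrix scaling, random seed, and the 1/8 validation fraction are illustrative assumptions, since the paper does not report them.

```python
# Hedged sketch: generate N = 2^15 pairs (y_i, x_i) from a Bernoulli-Gaussian prior
# and the linear model y = A x + noise, with m = 250, d = 500 and a fixed random
# Gaussian sensing matrix A. Sparsity, noise level, and split fraction are assumptions.
import numpy as np

rng = np.random.default_rng(0)
m, d, N = 250, 500, 2 ** 15
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, d))   # fixed random Gaussian sensing matrix

sparsity = 0.1    # hypothetical Bernoulli-Gaussian sparsity level
sigma = 0.01      # hypothetical noise standard deviation

mask = rng.random(size=(N, d)) < sparsity             # Bernoulli support pattern
X = mask * rng.normal(size=(N, d))                    # Gaussian values on the support
Y = X @ A.T + sigma * rng.normal(size=(N, m))         # linear measurements with noise

# Hold out a slice for validation; the split proportion is not reported in the paper.
n_val = N // 8
X_train, Y_train = X[:-n_val], Y[:-n_val]
X_val, Y_val = X[-n_val:], Y[-n_val:]
```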
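The unrolled architecture and layerwise training described in the Pseudocode and Experiment Setup rows could look roughly like the PyTorch sketch below: per-layer MLP denoisers with three hidden layers of 70 GELU units applied entrywise inside an unrolled AMP iteration, each new layer optimized with earlier layers held fixed, followed by end-to-end finetuning. This is a hedged reconstruction, not the paper's Algorithm 1: the class and function names, the finite-difference Onsager estimate, and the Adam/MSE/full-batch/epoch choices are our own assumptions.

```python
# Hedged sketch of an unrolled-AMP network with learned MLP denoisers (not the paper's code).
import torch
import torch.nn as nn

def make_denoiser(hidden=70):
    # Entrywise MLP denoiser: three hidden layers of 70 GELU units, as described in the paper.
    return nn.Sequential(
        nn.Linear(1, hidden), nn.GELU(),
        nn.Linear(hidden, hidden), nn.GELU(),
        nn.Linear(hidden, hidden), nn.GELU(),
        nn.Linear(hidden, 1),
    )

class UnrolledAMP(nn.Module):
    """AMP unrolled into a network with one learned denoiser per layer (hypothetical naming)."""

    def __init__(self, A, n_layers):
        super().__init__()
        self.register_buffer("A", A)  # fixed (m, d) sensing matrix
        self.denoisers = nn.ModuleList(make_denoiser() for _ in range(n_layers))

    def forward(self, y, n_layers=None):
        m, d = self.A.shape
        active = self.denoisers[:n_layers] if n_layers is not None else self.denoisers
        x = y.new_zeros(y.shape[0], d)   # current signal estimate
        z = y.clone()                    # current residual
        eps = 1e-3                       # finite-difference step for the Onsager term (our choice)
        for eta in active:
            v = x + z @ self.A                                   # (batch, d) effective observation
            x_new = eta(v.unsqueeze(-1)).squeeze(-1)             # denoiser applied entrywise
            deriv = (eta((v + eps).unsqueeze(-1))
                     - eta((v - eps).unsqueeze(-1))).squeeze(-1) / (2 * eps)
            z = y - x_new @ self.A.T + (d / m) * deriv.mean(dim=1, keepdim=True) * z
            x = x_new
        return x

def train_layerwise(model, Y, X, n_epochs=5, lr=1e-3):
    # Layerwise training with finetuning, loosely in the spirit of Algorithm 1:
    # only the newest denoiser's parameters are passed to the optimizer, so earlier
    # layers stay fixed; a final pass finetunes all layers end to end.
    loss_fn = nn.MSELoss()
    for t in range(1, len(model.denoisers) + 1):
        opt = torch.optim.Adam(model.denoisers[t - 1].parameters(), lr=lr)
        for _ in range(n_epochs):                # full-batch steps for brevity
            opt.zero_grad()
            loss_fn(model(Y, n_layers=t), X).backward()
            opt.step()
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(n_epochs):                    # end-to-end finetuning
        opt.zero_grad()
        loss_fn(model(Y), X).backward()
        opt.step()
    return model
```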