Unrolled denoising networks provably learn to perform optimal Bayesian inference
Authors: Aayush Karan, Kulin Shah, Sitan Chen, Yonina Eldar
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this work, we prove the first rigorous learning guarantees for neural networks based on unrolling approximate message passing (AMP). For compressed sensing, we prove that when trained on data drawn from a product prior, the layers of the network approximately converge to the same denoisers used in Bayes AMP. We also provide extensive numerical experiments for compressed sensing and rank-one matrix estimation demonstrating the advantages of our unrolled architecture: in addition to being able to obliviously adapt to general priors, it exhibits improvements over Bayes AMP in more general settings of low dimensions, non-Gaussian designs, and non-product priors. |
| Researcher Affiliation | Academia | Aayush Karan, Harvard SEAS, akaran1@g.harvard.edu; Kulin Shah, UT Austin, kulinshah@utexas.edu; Sitan Chen, Harvard SEAS, sitan@seas.harvard.edu; Yonina C. Eldar, Weizmann Institute of Science, yonina.eldar@weizmann.ac.il |
| Pseudocode | Yes | Algorithm 1: Layerwise Training. Algorithm 2: Learning B. |
| Open Source Code | No | We currently don't provide the code for open access, but we are planning to do so soon. |
| Open Datasets | No | We randomly generated a train and validation dataset {(y_i, x_i)}_{i=1}^N with N = 2^15 samples by sampling from the prior and using Eq. (1). |
| Dataset Splits | No | We randomly generated a train and validation dataset {(y_i, x_i)}_{i=1}^N with N = 2^15 samples by sampling from the prior and using Eq. (1). Specific proportions or counts for the validation split are not provided. |
| Hardware Specification | No | The estimated amount of training time for our final experiments is around 100 CPU hours. This indicates CPU usage but does not provide specific CPU models, memory details, or other hardware specifications. |
| Software Dependencies | No | The paper mentions 'GELU activations' but does not specify software dependencies with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x). |
| Experiment Setup | Yes | Implementation details. We set m = 250, d = 500 and fix a random Gaussian sensing matrix A ∈ ℝ^{250×500}. We consider two choices of prior for our experiments: Bernoulli-Gaussian and ℤ₂ (i.e., uniform over {−1, +1}^n). For our unrolled architecture, the family F of learned MLP denoisers was restricted to three hidden layers, each with 70 neurons and GELU activations. This particular architectural choice was the most convenient for our experiments, but our experimental findings are not particularly sensitive to this. We randomly generated a train and validation dataset {(y_i, x_i)}_{i=1}^N with N = 2^15 samples by sampling from the prior and using Eq. (1). We train layerwise with finetuning as in Algorithm 1. |
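
The data-generation recipe quoted in the Experiment Setup row is concrete enough to sketch. Below is a minimal NumPy version; the sparsity level `eps`, the noise level `sigma`, the variance of the nonzero component, and the column scaling of the sensing matrix are illustrative assumptions, since Eq. (1) and the exact prior parameters are not reproduced in this report.

```python
import numpy as np

# Dimensions and sample count from the "Experiment Setup" row; `eps` (Bernoulli-Gaussian
# sparsity), `sigma` (measurement-noise level), and the 1/sqrt(m) column scaling of A
# are illustrative assumptions, not values quoted from the paper.
m, d, N = 250, 500, 2 ** 15
eps, sigma = 0.1, 0.05
rng = np.random.default_rng(0)

# Fixed random Gaussian sensing matrix A in R^(250 x 500).
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, d))

def sample_bernoulli_gaussian(n: int) -> np.ndarray:
    """Entries are 0 with prob. 1 - eps and N(0, 1/eps) with prob. eps (unit second moment)."""
    mask = rng.random((n, d)) < eps
    return mask * rng.normal(0.0, 1.0 / np.sqrt(eps), size=(n, d))

# Train/validation pairs (y_i, x_i) with y_i = A x_i + noise, in the spirit of Eq. (1).
x = sample_bernoulli_gaussian(N)
y = x @ A.T + sigma * rng.normal(size=(N, m))
```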
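The architecture and training procedure quoted above (entrywise MLP denoisers with three hidden layers of 70 GELU units, unrolled AMP layers, layerwise training with finetuning as in Algorithm 1) can likewise be sketched. The PyTorch code below is one plausible reading under stated assumptions: the denoiser is a plain scalar MLP with no auxiliary noise-level input, the Onsager correction uses the standard AMP form (d/m)·⟨η′⟩, and the layerwise-then-finetune schedule, learning rates, and step counts are illustrative guesses rather than the paper's actual Algorithm 1.

```python
import torch
import torch.nn as nn

class MLPDenoiser(nn.Module):
    """Entrywise learned denoiser: three hidden layers of 70 GELU units (per the quoted setup)."""
    def __init__(self, hidden: int = 70):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # Apply the same scalar MLP to every coordinate of s.
        return self.net(s.unsqueeze(-1)).squeeze(-1)

def unrolled_amp(y, A, denoisers):
    """Forward pass of the unrolled network: AMP iterations with learned denoisers."""
    m, d = A.shape
    x = torch.zeros(y.shape[0], d)
    z = y.clone()
    for eta in denoisers:
        s = x + z @ A  # effective observation A^T z + x, one row per sample
        x_new = eta(s)
        # Onsager correction: (d/m) times the average derivative of the entrywise
        # denoiser, computed by autograd on a detached copy of s.
        s_det = s.detach().requires_grad_(True)
        (deriv,) = torch.autograd.grad(eta(s_det).sum(), s_det)
        b = (d / m) * deriv.mean(dim=1, keepdim=True)
        z = y - x_new @ A.T + b * z
        x = x_new
    return x

def _fit(params, y, x_true, A, denoisers, steps, lr):
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((unrolled_amp(y, A, denoisers) - x_true) ** 2).mean()
        loss.backward()
        opt.step()

def train_layerwise(y, x_true, A, num_layers=10, steps=200):
    """Add one layer at a time, train its denoiser on end-to-end MSE, then finetune jointly."""
    denoisers = []
    for _ in range(num_layers):
        denoisers.append(MLPDenoiser())
        _fit(denoisers[-1].parameters(), y, x_true, A, denoisers, steps, lr=1e-3)
    # Joint finetuning of all layers after the layerwise phase.
    _fit([p for f in denoisers for p in f.parameters()], y, x_true, A, denoisers, steps, lr=1e-4)
    return denoisers
```

With the NumPy data above, calling `train_layerwise` on `torch.as_tensor(y, dtype=torch.float32)`, `torch.as_tensor(x, dtype=torch.float32)`, and `torch.as_tensor(A, dtype=torch.float32)` would train the unrolled network end to end; in practice one would also mini-batch the 2^15 samples and hold out a validation split, details the report does not specify.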