How do Minimum-Norm Shallow Denoisers Look in Function Space?

Authors: Chen Zeno, Greg Ongie, Yaniv Blumenfeld, Nir Weinberger, Daniel Soudry

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We empirically verify this alignment phenomenon on synthetic data and real images. We train a one-hidden-layer ReLU network on a subset of N = 100 MNIST images for 10K iterations."
Researcher Affiliation | Academia | Chen Zeno, Yaniv Blumenfeld, Nir Weinberger, and Daniel Soudry: Electrical and Computer Engineering, Technion ({chenzeno,yanivbl}@campus.technion.ac.il, nirwein@technion.ac.il, daniel.soudry@gmail.com); Greg Ongie: Department of Mathematical and Statistical Sciences, Marquette University (gregory.ongie@marquette.edu).
Pseudocode | No | No pseudocode or algorithm block is provided in the paper.
Open Source Code | No | No concrete access to source code for the methodology is provided in the paper.
Open Datasets | Yes | "We use the MNIST dataset to verify various properties." The paper also cites a public benchmark: "in the commonly used denoising benchmark BSD68 [Roth and Black, 2009], the noise level σ = 0.1 is in the low noise regime."
Dataset Splits | No | "We train a one-hidden-layer ReLU network on a subset of N = 100 MNIST images for 10K iterations." No specific train/validation/test split percentages or counts are mentioned.
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used to run the experiments are provided in the paper.
Software Dependencies | No | "We trained a single-layer FC ReLU network with linear residual connection for 1M epochs, with weight decay of 1e-8 (as described in our model), and ADAM optimizer with learning rate 1e-5." Software names are mentioned, but no specific version numbers for libraries or frameworks are given.
Experiment Setup | Yes | "We trained a one-hidden-layer ReLU network with a skip connection on a denoising task... We use λ = 10^-5 in both settings. (1) NN denoiser trained online using (7) for 100K iterations, (2) NN denoiser trained offline using (8) with M = 9000 and 20K epochs. We trained a single-layer FC ReLU network with linear residual connection for 1M epochs, with weight decay of 1e-8 (as described in our model), and ADAM optimizer with learning rate 1e-5." A sketch of this setup appears below the table.
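Since no source code is released, the following is a minimal sketch (not the authors' code) of the setup quoted in the Experiment Setup row: a one-hidden-layer ReLU network with a linear skip (residual) connection, trained as a denoiser on N = 100 MNIST images with Adam (learning rate 1e-5) and weight decay 1e-8. The hidden width, the noise level σ = 0.1 (borrowed from the BSD68 remark), the MSE objective, and the 10K-step loop are assumptions where the excerpts are silent.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms

class SkipReLUDenoiser(nn.Module):
    """One-hidden-layer ReLU network plus a linear residual branch."""
    def __init__(self, dim: int, width: int = 1000):  # width is an assumption
        super().__init__()
        self.hidden = nn.Linear(dim, width)          # single hidden ReLU layer
        self.out = nn.Linear(width, dim)
        self.skip = nn.Linear(dim, dim, bias=False)  # linear skip connection

    def forward(self, x):
        return self.out(torch.relu(self.hidden(x))) + self.skip(x)

# Subset of N = 100 MNIST images, flattened to 784-dimensional vectors.
mnist = datasets.MNIST("data", train=True, download=True,
                       transform=transforms.ToTensor())
clean = torch.stack([mnist[i][0].flatten() for i in range(100)])

sigma = 0.1  # low-noise regime mentioned in the paper; assumed for this sketch
noisy = clean + sigma * torch.randn_like(clean)

model = SkipReLUDenoiser(dim=clean.shape[1])
# Hyperparameters quoted in the report: lr = 1e-5, weight decay = 1e-8.
opt = torch.optim.Adam(model.parameters(), lr=1e-5, weight_decay=1e-8)

for step in range(10_000):  # "10K iterations" per the excerpt
    opt.zero_grad()
    loss = nn.functional.mse_loss(model(noisy), clean)
    loss.backward()
    opt.step()
```

Note the design choice implied by the excerpts: the skip connection is a trainable linear map rather than an identity shortcut ("linear residual connection"), so in the low-noise regime the network can sit close to the identity while the ReLU branch learns only the correction.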