The Star Geometry of Critic-Based Regularizer Learning
Authors: Oscar Leong, Eliza O'Reilly, Yong Sheng Soh
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also experimentally show that such losses can be competitive for learning regularizers in a simple denoising setting. An empirical comparison between neural network-based regularizers learned using these losses and the adversarial loss is presented in Section 3.1. |
| Researcher Affiliation | Academia | Oscar Leong, Department of Statistics and Data Science, University of California, Los Angeles, oleong@stat.ucla.edu; Eliza O'Reilly, Department of Applied Mathematics and Statistics, Johns Hopkins University, eoreill2@jh.edu; Yong Sheng Soh, Department of Mathematics, National University of Singapore, matsys@nus.edu.sg |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | No statement or link indicating open-source code availability for the described methodology. |
| Open Datasets | Yes | To do this, we consider denoising on the MNIST dataset [50]. We take 10000 random samples from the MNIST training set (constituting our D_r distribution) and add Gaussian noise with variance σ² = 0.05 (constituting our D_n distribution). (This construction is sketched in code below the table.) |
| Dataset Splits | No | The paper does not explicitly mention a validation set or provide details on how data was split for validation. |
| Hardware Specification | Yes | The experiments were run on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper mentions the use of the Adam optimizer, but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | The regularizer networks were trained using the adversarial loss and the Hellinger-based loss (5). We also used the gradient penalty term from [53] for both losses. We used the Adam optimizer for 20000 epochs with learning rate 10⁻³. We ran gradient descent for 2000 iterations with a learning rate of 10⁻³. For the choice of regularization parameter λ, we note that in [53] the authors fix this value to λ := 2λ̄, where λ̄ := E[‖z‖₂] for z ∼ N(0, σ²I), as a regularizer that achieves a small gradient penalty will be (approximately) 1-Lipschitz. For the Hellinger-based network, we found that λ = 5.1λ̄² gave better performance, so we used this for recovery. We additionally tuned the regularization strength for the adversarially trained regularizer and found that λ = 0.75λ̄ performed better than the original fixed value. (The training step and recovery procedure are sketched in code below the table.) |
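The dataset construction quoted in the Open Datasets row is concrete enough to sketch. Below is a minimal, non-authoritative rendering assuming PyTorch and torchvision (the paper does not state its framework); the names `clean`, `noisy`, and `sigma2` are illustrative, not from the paper:

```python
import torch
from torchvision import datasets, transforms

# Load the MNIST training set; ToTensor() scales pixels to [0, 1].
mnist = datasets.MNIST(root="data", train=True, download=True,
                       transform=transforms.ToTensor())

# Take 10000 random training images as the clean distribution D_r.
idx = torch.randperm(len(mnist))[:10000]
clean = torch.stack([mnist[i][0] for i in idx.tolist()])  # (10000, 1, 28, 28)

# Add Gaussian noise with variance sigma^2 = 0.05 to form the noisy distribution D_n.
sigma2 = 0.05
noisy = clean + sigma2 ** 0.5 * torch.randn_like(clean)
```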
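The Experiment Setup row mentions training both regularizers with the gradient penalty term from [53]. The sketch below shows one training step for the adversarial loss with a WGAN-style one-sided gradient penalty; the penalty weight `mu`, the uniform interpolation scheme, and the function names are assumptions rather than details quoted from the paper, and the Hellinger-based loss (5) is omitted because its form is not reproduced here:

```python
import torch

def adversarial_step(regularizer, opt, clean_batch, noisy_batch, mu=10.0):
    """One critic update: push R down on clean samples and up on noisy ones."""
    opt.zero_grad()
    loss = regularizer(clean_batch).mean() - regularizer(noisy_batch).mean()

    # One-sided gradient penalty on random interpolates, encouraging R to be
    # (approximately) 1-Lipschitz, in the spirit of [53].
    eps = torch.rand(clean_batch.size(0), 1, 1, 1)
    x_hat = (eps * clean_batch + (1 - eps) * noisy_batch).requires_grad_(True)
    grad = torch.autograd.grad(regularizer(x_hat).sum(), x_hat, create_graph=True)[0]
    penalty = (grad.flatten(1).norm(dim=1) - 1).clamp(min=0).pow(2).mean()

    (loss + mu * penalty).backward()
    opt.step()
    return loss.item()
```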
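Finally, the λ̄ baseline and the gradient-descent recovery quoted above can be sketched as follows. Assumptions: λ̄ is estimated by Monte Carlo, `regularizer` is a trained network mapping images to scalar values, and plain SGD stands in for the gradient descent described in the setup; none of these names come from the paper:

```python
import torch

def estimate_lambda_bar(sigma2, shape, n_samples=10000):
    """Monte Carlo estimate of lambda-bar := E[||z||_2] for z ~ N(0, sigma^2 I)."""
    z = sigma2 ** 0.5 * torch.randn(n_samples, *shape)
    return z.flatten(1).norm(dim=1).mean().item()

def denoise(y, regularizer, lam, n_iters=2000, lr=1e-3):
    """Minimize 0.5 * ||x - y||^2 + lam * R(x) over x by gradient descent."""
    x = y.clone().requires_grad_(True)
    opt = torch.optim.SGD([x], lr=lr)  # plain gradient descent on the image
    for _ in range(n_iters):
        opt.zero_grad()
        loss = 0.5 * (x - y).pow(2).sum() + lam * regularizer(x).sum()
        loss.backward()
        opt.step()
    return x.detach()
```

Under this reading, the quoted settings correspond to `lam = 2 * lambda_bar` (or the tuned `0.75 * lambda_bar`) for the adversarially trained regularizer and `lam = 5.1 * lambda_bar ** 2` for the Hellinger-based one.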