Activation-Descent Regularization for Input Optimization of ReLU Networks
Authors: Hongzhan Yu, Sicun Gao
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate the effectiveness of the proposed input-optimization methods for improving the state-of-the-art in various areas, such as adversarial learning, generative modeling, and reinforcement learning. |
| Researcher Affiliation | Academia | 1Department of Computer Science & Engineering, University of California San Diego, USA. Correspondence to: Hongzhan Yu <hoy021@ucsd.edu>. |
| Pseudocode | Yes | Algorithm 1 Activation-Descent Regularization GD (ADRGD) |
| Open Source Code | Yes | Codes are available at github.com/hoy021/ADR-GD. |
| Open Datasets | Yes | We evaluate the proposed algorithm on MNIST (Deng, 2012), CIFAR10 (Krizhevsky et al., 2009), and ImageNet (Deng et al., 2009) datasets. |
| Dataset Splits | No | The paper mentions using standard datasets like MNIST, CIFAR10, and ImageNet, but does not explicitly specify the training/validation/test splits (e.g., percentages or sample counts) for its own experiments or reference a predefined split for reproducibility of its training process. |
| Hardware Specification | No | The paper provides runtime benchmarks but does not specify the hardware (e.g., GPU, CPU models, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions using TorchVision to download a pre-trained model but does not specify a version number for TorchVision or any other software dependency. |
| Experiment Setup | Yes | The perturbation size ϵ is set to 0.2, 8/255, and 2/255 for the evaluations on MNIST, CIFAR10, and ImageNet, respectively. Appendix C ("Experiment Parameters") Table 2 lists detailed parameters for each experiment: T (total iterations), β0 (initial coefficient), αx, αη, αβ (step sizes), r (perturbation scale), Tp (perturbation frequency), γ (coefficient decay rate), and δ, δβ (gradient tolerances). |
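To make the setup row concrete, below is a minimal, hedged sketch of ϵ-bounded input optimization on a tiny ReLU network, using the parameter names from Table 2 (ϵ, T, αx). The network weights, the finite-difference gradient, and the projection step are illustrative assumptions; the paper's actual Algorithm 1 (ADR-GD) adds an activation-descent regularizer that is not reproduced here.

```python
import numpy as np

# Hypothetical tiny ReLU network: 4 inputs -> 8 hidden units -> scalar output.
rng = np.random.default_rng(0)
W1, b1 = rng.standard_normal((8, 4)), rng.standard_normal(8)
w2 = rng.standard_normal(8)

def loss(x):
    h = np.maximum(W1 @ x + b1, 0.0)  # ReLU hidden layer
    return float(w2 @ h)

def grad(x, h_eps=1e-4):
    # Central finite differences keep the sketch dependency-free;
    # the paper works with analytic (regularized) gradients instead.
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h_eps
        g[i] = (loss(x + e) - loss(x - e)) / (2 * h_eps)
    return g

def input_opt(x0, eps=0.2, alpha_x=0.05, T=50):
    """Minimize the loss over inputs within an L-inf ball of radius eps
    around x0 (cf. eps = 0.2 used for MNIST in the paper)."""
    x = x0.copy()
    for _ in range(T):
        x = x - alpha_x * np.sign(grad(x))  # signed gradient step
        x = np.clip(x, x0 - eps, x0 + eps)  # project back into the eps-ball
    return x

x0 = rng.standard_normal(4)
x_opt = input_opt(x0)
print(np.max(np.abs(x_opt - x0)))  # stays within eps = 0.2
```

This is only a plain projected-descent baseline; the paper's contribution is precisely the regularization term that smooths descent across ReLU activation-pattern boundaries, whose details live in Algorithm 1 and Appendix C.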