Activation-Descent Regularization for Input Optimization of ReLU Networks

Authors: Hongzhan Yu, Sicun Gao

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate the effectiveness of the proposed input-optimization methods for improving the state-of-the-art in various areas, such as adversarial learning, generative modeling, and reinforcement learning."
Researcher Affiliation | Academia | "1Department of Computer Science & Engineering, University of California San Diego, USA. Correspondence to: Hongzhan Yu <hoy021@ucsd.edu>."
Pseudocode | Yes | "Algorithm 1 Activation-Descent Regularization GD (ADR-GD)"
Open Source Code | Yes | "Codes are available at github.com/hoy021/ADR-GD."
Open Datasets | Yes | "We evaluate the proposed algorithm on MNIST (Deng, 2012), CIFAR10 (Krizhevsky et al., 2009), and ImageNet (Deng et al., 2009) datasets."
Dataset Splits | No | The paper uses the standard MNIST, CIFAR10, and ImageNet datasets but does not explicitly specify training/validation/test splits (e.g., percentages or sample counts) for its own experiments, nor does it reference a predefined split that would allow its training process to be reproduced.
Hardware Specification | No | The paper reports runtime benchmarks but does not specify the hardware (e.g., GPU or CPU models, or cloud instances) used to run the experiments.
Software Dependencies | No | The paper mentions using "Torch Vision" to download a pre-trained model but does not give a version number for TorchVision or any other software dependency.
Experiment Setup | Yes | The perturbation size ϵ is set to 0.2, 8/255, and 2/255 for the evaluations on MNIST, CIFAR10, and ImageNet, respectively. Appendix C ("Experiment Parameters"), Table 2 lists the detailed parameters for the different experiments: T (total iterations), β0 (initial coefficient), αx, αη, αβ (step sizes), r (perturbation scale), Tp (perturbation frequency), γ (coefficient decay rate), and δ, δβ (gradient tolerances).
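The per-dataset ϵ values and the Table 2 parameter names can be collected into a configuration sketch. Only the ϵ values are stated explicitly in the extracted text; every other numeric value below is a placeholder standing in for the actual entries of Appendix C, Table 2:

```python
# Per-dataset perturbation sizes, as stated in the paper.
adr_gd_epsilon = {
    "MNIST": 0.2,
    "CIFAR10": 8 / 255,
    "ImageNet": 2 / 255,
}

# Parameter names listed in Appendix C, Table 2.
# All values here are hypothetical placeholders, not the paper's settings.
adr_gd_params = {
    "T": 100,            # total iterations
    "beta0": 1.0,        # initial coefficient
    "alpha_x": 0.01,     # step size for the input x
    "alpha_eta": 0.01,   # step size for eta
    "alpha_beta": 0.01,  # step size for beta
    "r": 0.1,            # perturbation scale
    "T_p": 10,           # perturbation frequency
    "gamma": 0.9,        # coefficient decay rate
    "delta": 1e-4,       # gradient tolerance
    "delta_beta": 1e-4,  # gradient tolerance for beta
}
```

A reproduction attempt would fill `adr_gd_params` from Table 2 for each dataset before running the optimization loop.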