Symmetries, Flat Minima, and the Conserved Quantities of Gradient Flow

Authors: Bo Zhao, Iordan Ganev, Robin Walters, Rose Yu, Nima Dehmamy

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We present a set of experiments aimed at assessing the utility of the nonlinear group action and conserved quantities. A summary of the results is shown in Figure 2. We show that the value of conserved quantities can impact convergence rate and generalizability. We also find the nonlinear action to be viable for ensemble building to improve robustness under certain adversarial attacks.
Researcher Affiliation | Collaboration | Bo Zhao (University of California, San Diego, bozhao@ucsd.edu); Iordan Ganev (Radboud University, iganev@cs.ru.nl); Robin Walters (Northeastern University, r.walters@northeastern.edu); Rose Yu (University of California, San Diego, roseyu@ucsd.edu); Nima Dehmamy (IBM Research, nima.dehmamy@ibm.com)
Pseudocode | No | The paper describes steps for an algorithm in paragraph form in Section 4 ('0. Input: weight matrices...', '1. Determine the spherical coordinates...') but does not present it as a structured pseudocode block or algorithm environment.
Open Source Code | Yes | Our code is available at https://github.com/Rose-STL-Lab/Gradient-Flow-Symmetry.
Open Datasets | Yes | We test the group action on CIFAR-10.
Dataset Splits | No | The paper mentions using CIFAR-10 but does not provide specific details on how the dataset was split into training, validation, or test sets.
Hardware Specification | No | No specific hardware details such as GPU/CPU models, memory, or cloud instance types are mentioned for running experiments.
Software Dependencies | No | No specific software dependencies with version numbers (e.g., 'Python 3.8, PyTorch 1.9') are mentioned in the paper.
Experiment Setup | Yes | We repeat the gradient descent with learning rate 0.1, 0.01, and 0.001. The learning rate is set to 10⁻³... U and V are initialized with different variance... The model contains a convolution layer with kernel size 3, followed by a max pooling, a fully connected layer, a leaky ReLU activation, and another fully connected layer.
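
Based only on the architecture quoted in the Experiment Setup row, a minimal PyTorch sketch of the described CIFAR-10 model might look like the following. The channel count, hidden width, and pooling size are assumptions for illustration; they are not stated in the excerpt, and the paper's repository should be consulted for the exact configuration.

```python
import torch
import torch.nn as nn

class SmallConvNet(nn.Module):
    """Sketch of the quoted architecture: conv (kernel size 3) -> max pool ->
    fully connected -> leaky ReLU -> fully connected.
    Channel count (16), hidden width (128), and pool size (2) are assumed."""
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, kernel_size=3)   # CIFAR-10 images have 3 channels
        self.pool = nn.MaxPool2d(2)                   # pooling size assumed to be 2
        self.fc1 = nn.Linear(16 * 15 * 15, 128)       # 32x32 -> 30x30 after conv -> 15x15 after pool
        self.act = nn.LeakyReLU()
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x):
        x = self.pool(self.conv(x))
        x = torch.flatten(x, start_dim=1)
        return self.fc2(self.act(self.fc1(x)))

# One setting from the quoted learning-rate sweep (0.1, 0.01, 0.001),
# using plain gradient descent as described in the excerpt.
model = SmallConvNet()
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
```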