Implicit Jacobian regularization weighted with impurity of probability output

Authors: Sungyoon Lee, Jinseong Park, Jaewook Lee

ICML 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We also evaluate the Explicit Jacobian Regularization, which outperforms state-of-the-art sharpness-aware optimization methods, SAM (Foret et al., 2021) and ASAM (Kwon et al., 2021) (Table 1 in Section 5.3). |
| Researcher Affiliation | Academia | ¹Department of Computer Science, Hanyang University; ²Department of Industrial Engineering, Seoul National University. |
| Pseudocode | Yes | Algorithm 1: Power iteration; Algorithm 2: Power iteration for the Jacobian (a minimal sketch of Jacobian power iteration follows the table). |
| Open Source Code | No | The paper links to third-party libraries and implementations it uses (e.g., VGG, ResNet, PyHessian, a SAM implementation), but it provides no link or explicit statement releasing code for its own core method (EJR). |
| Open Datasets | Yes | We use the CIFAR-10 dataset ((Krizhevsky & Hinton, 2009), https://www.cs.toronto.edu/~kriz/cifar.html) and the MNIST dataset, which have C = 10 classes. We also conduct some experiments on the CIFAR-100 dataset with C = 100 classes. |
| Dataset Splits | Yes | The paper uses standard benchmarks (CIFAR-10, CIFAR-100, MNIST), which ship with predefined, commonly used training/testing splits. Explicit percentages are not stated, but these standard datasets imply well-defined splits (see the loading sketch after the table). |
| Hardware Specification | No | The paper describes its experiments but gives no hardware details such as GPU models, CPU types, or memory configurations. |
| Software Dependencies | No | The paper mentions using PyTorch code and the PyHessian tool, but it does not specify version numbers for these or any other software dependencies, which would be necessary for reproducibility. |
| Experiment Setup | Yes | Tables 3 and 4 detail the experimental settings, including batch size, initial learning rate, epochs, learning-rate scheduler, momentum, and weight decay for each model and dataset. For example, for the simple CNN on CIFAR-10, initial learning rates such as 0.01 and 0.03 are given (a hedged training-loop sketch follows the table). |
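The paper's Algorithms 1 and 2 appear only as pseudocode. As a rough illustration of what Algorithm 2 computes, below is a minimal PyTorch sketch (not the authors' code) that estimates the largest singular value of the input-output Jacobian by power iteration on JᵀJ, using autograd for the Jacobian-vector and vector-Jacobian products. The function name, iteration count, and toy model are our assumptions.

```python
import torch
import torch.nn as nn

def jacobian_sigma_max(model, x, n_iters=20, eps=1e-12):
    """Estimate the largest singular value of J = d model(x) / d x
    by power iteration on J^T J. Illustrative sketch only; not the
    paper's Algorithm 2."""
    v = torch.randn_like(x)          # random start vector in input space
    v = v / (v.norm() + eps)
    for _ in range(n_iters):
        # u = J v  (Jacobian-vector product, output-space vector)
        _, u = torch.autograd.functional.jvp(model, x, v)
        # w = J^T u = (J^T J) v  (vector-Jacobian product, back to input space)
        _, w = torch.autograd.functional.vjp(model, x, u)
        v = w / (w.norm() + eps)     # renormalize the iterate
    # At convergence, ||J v|| with ||v|| = 1 approximates sigma_max(J).
    _, u = torch.autograd.functional.jvp(model, x, v)
    return u.norm()

# Usage on a tiny classifier with a CIFAR-10-shaped input.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
x = torch.randn(1, 3, 32, 32)
print(jacobian_sigma_max(model, x).item())
```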
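On the dataset splits: the canonical CIFAR-10 partition is 50,000 training and 10,000 test images. As a concrete illustration (ours, not from the paper), torchvision exposes these predefined splits through the `train` flag:

```python
from torchvision import datasets, transforms

to_tensor = transforms.ToTensor()
# Canonical CIFAR-10 split: 50,000 training and 10,000 test images.
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=to_tensor)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=to_tensor)
print(len(train_set), len(test_set))  # 50000 10000
```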
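Finally, a hedged sketch of how the hyperparameters listed in Tables 3 and 4 would typically be wired up in PyTorch. Only the 0.01 initial learning rate comes from the report; the momentum, weight decay, epoch count, scheduler choice, model, and dummy data below are placeholders, not the paper's settings.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

# Dummy stand-ins so the sketch runs end to end (not the paper's setup).
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
loader = DataLoader(
    TensorDataset(torch.randn(64, 3, 32, 32), torch.randint(0, 10, (64,))),
    batch_size=32,
)

# Hyperparameter wiring in the shape of Tables 3-4. Only lr=0.01 is taken
# from the report; momentum, weight decay, epochs, and the cosine
# scheduler are placeholder assumptions.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=5e-4)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=5)

for epoch in range(5):
    for x, y in loader:
        optimizer.zero_grad()
        loss = nn.functional.cross_entropy(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step()  # step the lr schedule once per epoch
```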