On the Parameterization of Second-Order Optimization Effective towards the Infinite Width
Authors: Satoki Ishikawa, Ryo Karakida
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically verify the effectiveness of our proposed parameterization in the training of various neural networks. In particular, it enables us to transfer optimal learning rates and damping terms from narrow models to wider ones (Sections 5.2 and 5.3). A hedged sketch of this transfer workflow follows the table. |
| Researcher Affiliation | Academia | Satoki Ishikawa, Department of Computer Science, Tokyo Institute of Technology, Japan (riverstone@rio.gsic.titech.ac.jp); Ryo Karakida, Artificial Intelligence Research Center, AIST, Japan (karakida.ryo@aist.go.jp) |
| Pseudocode | No | The paper contains mathematical derivations and descriptions of methods but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper states: "In all experiments, we implemented second-order optimization based on the ASDL library (Osawa et al., 2023a)." However, it does not provide concrete access to the source code for the specific methodology described in this paper, nor does it make a general statement about code release. |
| Open Datasets | Yes | Figure 1: In the upper graph, we trained a 3-layer MLP on the MNIST dataset... In the second graph, we trained a Myrtle-5 on CIFAR10... Figure 4: (Left) We trained CBOW on WikiText2... (Right) We trained ResNet18 on CIFAR100... Figure 5: We trained ResNet50 on ImageNet... |
| Dataset Splits | No | The paper mentions reducing dataset size for some experiments (e.g., "The training sets have been reduced to 256 samples", "The number of samples is reduced to 1024") and "Validation accuracy is highest..." but does not provide specific details on the proportions or counts of training, validation, and test splits for reproducibility. |
| Hardware Specification | No | The paper does not explicitly describe the hardware specifications (e.g., specific GPU/CPU models, memory amounts) used to run the experiments. |
| Software Dependencies | No | The paper states: "In all experiments, we implemented second-order optimization based on the ASDL library (Osawa et al., 2023a)." However, it does not provide specific version numbers for ASDL, PyTorch, or any other key software dependencies. |
| Experiment Setup | Yes | Section B.2 "DETAILS OF FIGURES" provides extensive experimental setup details, including learning rates (e.g., "η = 0.001"), damping terms (e.g., "ρ = 1"), data augmentation techniques (e.g., "Random Crop, Random Horizontal Flip, AutoAugment, and Cutout"), and loss functions ("cross-entropy loss with label smoothing"); a minimal sketch of this input pipeline and loss follows the table. |
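The hyperparameter-transfer claim in the Research Type row can be read as the following workflow: tune a hyperparameter on a narrow proxy model, then reuse the tuned value at a larger width. The sketch below is a hypothetical illustration of that loop only; it uses plain SGD on synthetic data, so the paper's second-order optimizer, its damping term ρ, and the width-dependent parameterization that actually keeps the optimum stable across widths are not reproduced here.

```python
# Hypothetical illustration of the "tune narrow, reuse wide" workflow.
# Plain SGD on synthetic data; the paper's second-order updates and
# parameterization are intentionally omitted.
import torch
import torch.nn as nn

def make_mlp(width: int) -> nn.Module:
    return nn.Sequential(nn.Linear(20, width), nn.ReLU(), nn.Linear(width, 2))

def train_loss(width: int, lr: float, steps: int = 200) -> float:
    torch.manual_seed(0)
    x, y = torch.randn(512, 20), torch.randint(0, 2, (512,))
    model, criterion = make_mlp(width), nn.CrossEntropyLoss()
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        opt.step()
    return loss.item()

# Grid-search the learning rate on the narrow proxy model ...
narrow_results = {lr: train_loss(width=64, lr=lr) for lr in [1e-3, 1e-2, 1e-1]}
best_lr = min(narrow_results, key=narrow_results.get)
print("best lr on narrow model:", best_lr)

# ... and reuse the tuned value for a much wider model.
print("loss at width 4096 with transferred lr:", train_loss(width=4096, lr=best_lr))
```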
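For the Experiment Setup row, the quoted configuration maps onto a standard PyTorch/torchvision pipeline as sketched below. The learning rate η = 0.001 and damping ρ = 1 are quoted from Appendix B.2; the label-smoothing factor 0.1, the CIFAR10 normalization statistics, the tiny placeholder model, and the use of RandomErasing as a stand-in for Cutout are assumptions. The paper's second-order preconditioner (implemented via ASDL) is replaced here with a plain SGD step, so this is a sketch of the data pipeline and loss only, not of the optimizer.

```python
import torch
import torch.nn as nn
from torchvision import datasets, transforms
from torchvision.transforms import AutoAugment, AutoAugmentPolicy

# Augmentations listed in Appendix B.2: Random Crop, Random Horizontal Flip,
# AutoAugment, and Cutout (approximated below with RandomErasing).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    AutoAugment(AutoAugmentPolicy.CIFAR10),
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)),
    transforms.RandomErasing(),  # stand-in for Cutout
])

train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=train_transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

# Cross-entropy loss with label smoothing; the smoothing value 0.1 is assumed.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

# Hyperparameters quoted in the row; the damping rho would feed the second-order
# preconditioner, which is omitted, so a plain SGD step stands in for it.
lr, damping = 1e-3, 1.0
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 512), nn.ReLU(),
                      nn.Linear(512, 10))  # placeholder instead of Myrtle-5/ResNet
optimizer = torch.optim.SGD(model.parameters(), lr=lr)

for images, labels in train_loader:
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    break  # one illustrative step
```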