Re-parameterizing Your Optimizers rather than Architectures
Authors: Xiaohan Ding, Honghao Chen, Xiangyu Zhang, Kaiqi Huang, Jungong Han, Guiguang Ding
ICLR 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 4 EXPERIMENTS |
| Researcher Affiliation | Collaboration | 1Tencent AI Lab 2CRISE, Institute of Automation, Chinese Academy of Sciences 3School of Artificial Intelligence, University of Chinese Academy of Sciences 4MEGVII Technology 5Beijing Academy of Artificial Intelligence 6Department of Computer Science, the University of Sheffield 7School of Software, BNRist, Tsinghua University |
| Pseudocode | No | The paper describes methods and processes but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code and models: https://github.com/DingXiaoH/RepOptimizers. |
| Open Datasets | Yes | We use CIFAR-100 for searching the hyper-parameters of RepOptimizers... For training RepOpt-VGG and RepVGG on ImageNet |
| Dataset Splits | Yes | We report the accuracy on the validation set. |
| Hardware Specification | Yes | Max BS+1 would cause OOM (Out Of Memory) error on the 2080Ti GPU which has 11GB of memory. For the fair comparison, the training costs of all the models are tested with the same training script on the same machine with eight 2080Ti GPUs. |
| Software Dependencies | No | The paper mentions using PyTorch for its quantization examples but does not provide version numbers for PyTorch or any other software dependency. |
| Experiment Setup | Yes | Specifically, we use 8 GPUs, a batch size of 32 per GPU, input resolution of 224×224, and a learning rate schedule with 5-epoch warm-up, initial value of 0.1 and cosine annealing for 120 epochs. For the data augmentation, we use a pipeline of random cropping, left-right flipping and RandAugment (Cubuk et al., 2020). We also use a label smoothing coefficient of 0.1. The regular SGD optimizers for the baseline models and the RepOptimizers for RepOpt-VGG use momentum of 0.9 and weight decay of 4×10⁻⁵. |
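The learning-rate schedule reported in the Experiment Setup row (5-epoch warm-up to a peak of 0.1, then cosine annealing over the remaining epochs of a 120-epoch run) can be sketched as a plain-Python function. The linear warm-up shape and the final annealed value of 0 are assumptions; the paper states only "5-epoch warm-up" and "cosine annealing".

```python
import math

TOTAL_EPOCHS = 120   # from the paper's reported setup
WARMUP_EPOCHS = 5    # from the paper's reported setup
BASE_LR = 0.1        # initial value from the paper's reported setup


def lr_at(epoch: float) -> float:
    """Learning rate at a (possibly fractional) epoch.

    Assumptions not stated in the paper: warm-up is linear from 0,
    and the cosine schedule anneals all the way down to 0.
    """
    if epoch < WARMUP_EPOCHS:
        # Linear ramp from 0 up to the base learning rate.
        return BASE_LR * epoch / WARMUP_EPOCHS
    # Cosine annealing from BASE_LR toward 0 over the remaining epochs.
    progress = (epoch - WARMUP_EPOCHS) / (TOTAL_EPOCHS - WARMUP_EPOCHS)
    return 0.5 * BASE_LR * (1.0 + math.cos(math.pi * progress))


print(lr_at(0))    # 0.0  (start of warm-up)
print(lr_at(5))    # 0.1  (peak, warm-up finished)
print(lr_at(120))  # ~0.0 (fully annealed)
```

This matches the row's description at the checkpoints that are pinned down by the text (peak of 0.1 after warm-up, decay over 120 epochs); the exact PyTorch scheduler used is not specified in the paper.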