Self-Adaptive Training: beyond Empirical Risk Minimization

Authors: Lang Huang, Chao Zhang, Hongyang Zhang

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on the CIFAR and ImageNet datasets verify the effectiveness of our approach in two applications: classification with label noise and selective classification."
Researcher Affiliation | Academia | "Lang Huang, Peking University, laynehuang@pku.edu.cn; Chao Zhang, Peking University, c.zhang@pku.edu.cn; Hongyang Zhang, TTIC, hongyanz@ttic.edu"
Pseudocode | Yes | "Algorithm 1 Self-Adaptive Training" (a hedged sketch of such a training loop is given after this table)
Open Source Code | Yes | "The code is available at https://github.com/LayneH/self-adaptive-training."
Open Datasets | Yes | "We conduct the experiments on the CIFAR10 and CIFAR100 datasets [18]... on the ImageNet under both standard setup (i.e., using original labels) and the case that 40% training labels are corrupted."
Dataset Splits | Yes | "In this section, we conduct the experiments on the CIFAR10 dataset [18], of which we split the original training data into a training set (consists of first 45,000 data pairs) and a validation set (consists of last 5,000 data pairs)."
Hardware Specification | No | The paper does not provide specific hardware details (such as CPU/GPU models or cloud instance types) used to run its experiments.
Software Dependencies | No | The paper states the networks are "implemented on PyTorch [28]" but does not specify a version for PyTorch or any other software dependency.
Experiment Setup | Yes | "The networks are implemented on PyTorch [28] and optimized using SGD with initial learning rate of 0.1, momentum of 0.9, weight decay of 0.0005, batch size of 256, total training epochs of 200. The learning rate is decayed to zero using cosine annealing schedule [21]. We use data augmentation of random horizontal flipping and cropping. We fix the hyper-parameters Es = 60, α = 0.9 by default if not specified. ... We set the initial learning rate as 0.1 and decay it by a factor of 0.1 in epochs 75 and 90, respectively. We choose 1/λ = 6.0 as suggested by [48] and use Es = 70, α = 0.9 for our approach." (a configuration sketch follows the table)
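
For context on the "Pseudocode" row: the paper's Algorithm 1 is not reproduced here, but the hyper-parameters quoted under "Experiment Setup" (Es and α) refer to the start epoch and momentum of its soft-target update. The PyTorch-style sketch below is a hedged reconstruction of that idea (per-sample soft targets updated as an exponential moving average of model predictions, with samples re-weighted by target confidence), not a verbatim copy of Algorithm 1; the class name `SelfAdaptiveLoss` and the `indices` argument are illustrative, and the released repository should be treated as authoritative.

```python
import torch
import torch.nn.functional as F


class SelfAdaptiveLoss:
    """Hedged sketch of a self-adaptive training criterion.

    Keeps one soft-target vector per training example, initialized from the
    (possibly noisy) labels. After a start epoch ``es`` the targets are
    updated as an exponential moving average of the model's predictions with
    momentum ``alpha``, and samples are re-weighted by target confidence.
    """

    def __init__(self, labels, num_classes, es=60, alpha=0.9):
        self.es = es
        self.alpha = alpha
        # One soft-target row per training example, kept on the CPU.
        self.targets = F.one_hot(labels, num_classes).float()

    def __call__(self, logits, indices, epoch):
        indices = indices.cpu()
        targets = self.targets[indices].to(logits.device)
        if epoch >= self.es:
            probs = F.softmax(logits.detach(), dim=1)
            # Exponential-moving-average update of the stored soft targets.
            targets = self.alpha * targets + (1.0 - self.alpha) * probs
            self.targets[indices] = targets.cpu()
        # Confidence of the current soft target acts as a per-sample weight.
        weights, _ = targets.max(dim=1)
        log_probs = F.log_softmax(logits, dim=1)
        per_sample = -(targets * log_probs).sum(dim=1)
        return (weights * per_sample).sum() / weights.sum()
```

In a full training loop the dataset would also have to return each example's index so that its stored target can be looked up and updated between epochs.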
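
The "Experiment Setup" excerpt (SGD with learning rate 0.1, momentum 0.9, weight decay 0.0005, batch size 256, 200 epochs, cosine annealing, random horizontal flipping and cropping) maps onto a standard PyTorch configuration. The sketch below shows only that wiring; the stand-in model, the crop padding of 4, and the empty loop body are assumptions, not details taken from the paper.

```python
import torch
from torch import nn
from torchvision import transforms

# Hyper-parameters quoted in the paper's CIFAR setup.
LR, MOMENTUM, WEIGHT_DECAY = 0.1, 0.9, 5e-4
BATCH_SIZE, EPOCHS = 256, 200

# Augmentation mentioned in the setup; padding=4 is the usual CIFAR choice
# (an assumption, the excerpt does not state it).
train_transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# Stand-in classifier; the paper trains standard CIFAR architectures.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))

optimizer = torch.optim.SGD(
    model.parameters(), lr=LR, momentum=MOMENTUM, weight_decay=WEIGHT_DECAY
)
# Cosine annealing decays the learning rate towards zero over the run [21].
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)

for epoch in range(EPOCHS):
    # ... one pass over the augmented CIFAR training loader goes here ...
    scheduler.step()
```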