Provably End-to-end Label-noise Learning without Anchor Points

Authors: Xuefeng Li, Tongliang Liu, Bo Han, Gang Niu, Masashi Sugiyama

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results on benchmark datasets demonstrate the effectiveness and robustness of the proposed method.
Researcher Affiliation | Academia | 1. University of New South Wales; 2. Trustworthy Machine Learning Lab, University of Sydney; 3. Hong Kong Baptist University; 4. RIKEN AIP; 5. University of Tokyo
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks; the method is described in prose.
Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | We evaluate the proposed method on three synthetic noisy datasets, i.e., MNIST, CIFAR-10 and CIFAR-100, and one real-world noisy dataset, i.e., Clothing1M.
Dataset Splits | Yes | 'We leave out 10% of the training examples as the validation set' and 'we leave out 10% of the noisy training examples as a noisy validation set for model selection.' (A sketch of such a split appears after this table.)
Hardware Specification | Yes | For a fair comparison, we implement all methods with default parameters by PyTorch on Tesla V100-SXM2.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with version details.
Experiment Setup | Yes | For MNIST, we use a LeNet-5 network. SGD is used to train the classification network hθ with batch size 128, momentum 0.9, weight decay 10⁻³ and a learning rate of 10⁻². Adam with default parameters is used to train the transition matrix T̂. The algorithm is run for 60 epochs. For CIFAR-10, we use a ResNet-18 network. SGD is used to train both the classification network hθ and the transition matrix T̂ with batch size 128, momentum 0.9, weight decay 10⁻³ and an initial learning rate of 10⁻². The algorithm is run for 150 epochs and the learning rate is divided by 10 after the 30th and 60th epochs. For CIFAR-100, we use a ResNet-32 network. SGD is used to train the classification network hθ with batch size 128, momentum 0.9, weight decay 10⁻³ and an initial learning rate of 10⁻². Adam with default parameters is used to train the transition matrix T̂. The algorithm is run for 150 epochs and the learning rate is divided by 10 after the 30th and 60th epochs. For CIFAR-10 and CIFAR-100, we perform data augmentation by horizontal random flips and 32×32 random crops after padding 4 pixels on each side. For Clothing1M, we use a ResNet-50 pre-trained on ImageNet. We only use the 1M noisy data to train and validate the network. For the optimization, SGD is used to train both the classification network hθ and the transition matrix T̂ with momentum 0.9, weight decay 10⁻³, batch size 32, and run with learning rates 2×10⁻³ and 2×10⁻⁵ for 5 epochs each. (Hedged configuration sketches of this setup follow the table.)
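
Since the paper releases no code, the following is a minimal sketch of how the reported 10% noisy-validation split could be reproduced with torchvision. The dataset root, the random seed, and the omission of the synthetic label-corruption step are assumptions for illustration, not taken from the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Minimal sketch (not the authors' code): load CIFAR-10 and hold out 10% of
# the noisy training examples as a noisy validation set, as the paper reports.
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())

# In the paper, label noise is injected synthetically before splitting; that
# corruption step (which would overwrite train_set.targets) is omitted here.

val_size = int(0.1 * len(train_set))          # leave out 10% for validation
train_size = len(train_set) - val_size
generator = torch.Generator().manual_seed(0)  # assumed seed for reproducibility
train_subset, noisy_val_subset = random_split(
    train_set, [train_size, val_size], generator=generator)
```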
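Below is a minimal PyTorch sketch of the CIFAR-10 optimizer and schedule described in the Experiment Setup row. The loss function is not shown, and the plain identity-initialized parameter standing in for the transition matrix T̂ is a placeholder assumption; the paper constrains T̂ differently.

```python
import torch
import torch.nn as nn
import torchvision

# Sketch of the reported CIFAR-10 setup (not the authors' code): ResNet-18
# classifier and a learnable transition matrix, both trained by SGD with
# batch size 128, momentum 0.9, weight decay 1e-3, initial lr 1e-2, and the
# lr divided by 10 after the 30th and 60th epochs.
num_classes = 10
classifier = torchvision.models.resnet18(num_classes=num_classes)

# Placeholder parameterization of the transition matrix T̂ (assumption: only
# the optimizer wiring is illustrated, not the paper's actual constraint).
transition_logits = nn.Parameter(torch.eye(num_classes))

optimizer = torch.optim.SGD(
    list(classifier.parameters()) + [transition_logits],
    lr=1e-2, momentum=0.9, weight_decay=1e-3)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[30, 60], gamma=0.1)

for epoch in range(150):
    # ... one training epoch over the noisy data loader would go here ...
    scheduler.step()
```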
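The CIFAR-10/100 augmentation described above maps directly onto standard torchvision transforms; a sketch follows. The normalization statistics are an assumption, since the paper does not state them.

```python
from torchvision import transforms

# Sketch of the reported CIFAR-10/100 augmentation (not the authors' code):
# horizontal random flips and 32x32 random crops after padding 4 pixels per side.
train_transform = transforms.Compose([
    transforms.RandomHorizontalFlip(),
    transforms.RandomCrop(32, padding=4),
    transforms.ToTensor(),
    # Commonly used CIFAR statistics; assumed, not reported in the paper.
    transforms.Normalize((0.4914, 0.4822, 0.4465),
                         (0.2470, 0.2435, 0.2616)),
])
```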
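Finally, a sketch of the reported Clothing1M fine-tuning schedule, assuming a 14-class output head (the standard Clothing1M label set) and the current torchvision weights API; older torchvision versions use pretrained=True instead.

```python
import torch
import torchvision

# Sketch of the reported Clothing1M setup (not the authors' code): ResNet-50
# pre-trained on ImageNet, SGD with momentum 0.9, weight decay 1e-3, batch
# size 32, trained for 5 epochs at lr 2e-3 and 5 more epochs at lr 2e-5.
model = torchvision.models.resnet50(weights="IMAGENET1K_V1")
model.fc = torch.nn.Linear(model.fc.in_features, 14)  # 14 Clothing1M classes

for lr, epochs in [(2e-3, 5), (2e-5, 5)]:
    optimizer = torch.optim.SGD(model.parameters(), lr=lr,
                                momentum=0.9, weight_decay=1e-3)
    for epoch in range(epochs):
        pass  # one pass over the 1M noisy training examples would go here
```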