reproducibilityindex.ai

Learning Noise Transition Matrix from Only Noisy Labels via Total Variation Regularization

Authors: Yivan Zhang, Gang Niu, Masashi Sugiyama

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show the effectiveness of the proposed method through experiments on benchmark and real-world datasets.
Researcher Affiliation	Academia	(1) The University of Tokyo, Japan; (2) RIKEN AIP, Japan.
Pseudocode	No	The paper describes algorithmic steps in prose (e.g., Dirichlet posterior update) but does not include formal pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide an explicit statement or link indicating that its source code is open-source or publicly available.
Open Datasets	Yes	We evaluated our method on three image classification datasets, namely MNIST (LeCun et al., 1998), CIFAR-10, and CIFAR-100 (Krizhevsky, 2009). We also evaluated our method on a real-world noisy label dataset, Clothing1M (Xiao et al., 2015).
Dataset Splits	No	The paper uses well-known datasets that have standard splits but does not explicitly state the training, validation, and test split percentages or sample counts within the text, nor does it cite a specific source for these predefined splits.
Hardware Specification	Yes	We implemented data-parallel distributed training on 64 NVIDIA Tesla P100 GPUs by PyTorch (Paszke et al., 2019).
Software Dependencies	No	The paper mentions 'PyTorch (Paszke et al., 2019)' but does not specify a version number for PyTorch or any other software libraries or dependencies, which is necessary for reproducibility.
Experiment Setup	Yes	For the gradient-based estimation, we initialized the unconstrained matrix with diagonal elements of log(0.5) and off-diagonal elements of log(0.5/(K − 1)), so after applying softmax the diagonal elements are 0.5. For the Dirichlet posterior update method, we initialized the concentration matrix with diagonal elements of 10 for MNIST and 100 otherwise and off-diagonal elements of 0. We set β = (0.999, 0.01) and γ = 0.1. We sampled 512 (the same as the batch size) pairs in each batch to calculate the pairwise total variation distance. Other hyperparameters are provided in Appendix E.
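
The initialization and pair sampling described in the Experiment Setup row can be illustrated with a minimal sketch in plain Python. This is not the authors' code, which the entry above reports as not released. All function names are hypothetical, and the softmax is assumed to be applied row-wise. Sampling pairs uniformly with replacement is also an assumption; the quoted text only gives the number of pairs (512). The total variation distance between two discrete distributions is the standard half-L1 distance.

```python
import math
import random


def init_unconstrained_matrix(num_classes):
    """K x K matrix with log(0.5) on the diagonal and log(0.5 / (K - 1)) elsewhere."""
    diag = math.log(0.5)
    off = math.log(0.5 / (num_classes - 1))
    return [[diag if i == j else off for j in range(num_classes)]
            for i in range(num_classes)]


def row_softmax(matrix):
    """Apply a numerically stable softmax to each row (row-wise is an assumption)."""
    result = []
    for row in matrix:
        shift = max(row)
        exps = [math.exp(v - shift) for v in row]
        total = sum(exps)
        result.append([e / total for e in exps])
    return result


def init_concentration_matrix(num_classes, diag_value):
    """Dirichlet concentration matrix: diag_value on the diagonal, 0 elsewhere.

    Per the quoted setup, diag_value is 10 for MNIST and 100 otherwise.
    """
    return [[float(diag_value) if i == j else 0.0 for j in range(num_classes)]
            for i in range(num_classes)]


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def mean_pairwise_tv(probs, num_pairs=512, seed=0):
    """Average TV distance over randomly sampled pairs of predicted distributions.

    Uniform sampling with replacement is an assumption made for illustration.
    """
    rng = random.Random(seed)
    n = len(probs)
    total = 0.0
    for _ in range(num_pairs):
        i, j = rng.randrange(n), rng.randrange(n)
        total += total_variation(probs[i], probs[j])
    return total / num_pairs


if __name__ == "__main__":
    K = 10
    T = row_softmax(init_unconstrained_matrix(K))
    print(T[0][0], T[0][1])  # diagonal entry and one off-diagonal entry
```

With this initialization each row of the softmaxed matrix sums to 1, because the diagonal contributes 0.5 and the K − 1 off-diagonal entries contribute 0.5/(K − 1) each.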