Can Cross Entropy Loss Be Robust to Label Noise?

Authors: Lei Feng, Senlin Shu, Zhuoyi Lin, Fengmao Lv, Li Li, Bo An

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results on benchmark datasets demonstrate that our proposed approach significantly outperforms the state-of-the-art counterparts.
Researcher Affiliation | Academia | Lei Feng¹, Senlin Shu², Zhuoyi Lin¹, Fengmao Lv³, Li Li², Bo An¹. ¹School of Computer Science and Engineering, Nanyang Technological University, Singapore; ²College of Computer and Information Science, Southwest University, Chongqing, China; ³Center of Statistical Research, Southwestern University of Finance and Economics, China
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any statements or links regarding the availability of open-source code for the described methodology.
Open Datasets | Yes | Our experiments are conducted on MNIST [LeCun et al., 1998], Fashion-MNIST (Fashion in short) [Xiao et al., 2017], Kuzushiji-MNIST (Kuzushiji in short) [Clanuwat et al., 2018], CIFAR-10 [Krizhevsky et al., 2009] and CIFAR-100 [Krizhevsky et al., 2009]. (See the loading sketch below.)
Dataset Splits | No | The paper mentions training and testing sets, but does not explicitly describe a validation set, its size, or the splitting methodology.
Hardware Specification | No | The paper does not provide specific details about the hardware used for the experiments (e.g., GPU models or CPU specifications).
Software Dependencies | No | The paper mentions the Adam optimizer and specific architectures (LeNet-5, ResNet-34), but does not list any software libraries or version numbers (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | For all the methods, learning rate is selected from {10^-2, 10^-3, 10^-4, 10^-5}. ... all networks are trained using the Adam optimizer [Kingma and Ba, 2014] with the number of epochs set to 200 and the batch size set to 256. ... On the three datasets, networks are trained with weight decay of 10^-4. ... On the two datasets, networks are trained with weight decay of 0. ... GCE [Zhang and Sabuncu, 2018]: ... q is set to 0.7 ... PHuber-CE: ... τ is selected from {2, 10}. ... TCE: ... t is selected from {2, ..., 6}. (See the training sketch below.)
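
All five benchmarks in the Open Datasets row are publicly available. As a convenience for reproduction, here is a minimal loading sketch assuming torchvision; the paper itself names no data library, so the package choice is ours:

```python
import torchvision
import torchvision.transforms as T

# The five benchmarks named in the paper all ship with torchvision
# (an assumption of this sketch, not a statement from the paper).
transform = T.ToTensor()
datasets = {
    "MNIST": torchvision.datasets.MNIST("data", train=True, download=True, transform=transform),
    "Fashion": torchvision.datasets.FashionMNIST("data", train=True, download=True, transform=transform),
    "Kuzushiji": torchvision.datasets.KMNIST("data", train=True, download=True, transform=transform),
    "CIFAR-10": torchvision.datasets.CIFAR10("data", train=True, download=True, transform=transform),
    "CIFAR-100": torchvision.datasets.CIFAR100("data", train=True, download=True, transform=transform),
}
```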
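
The Experiment Setup excerpt pins down enough detail to sketch the compared losses and the optimizer configuration. Since no code was released, the sketch below is a reconstruction, not the authors' implementation: gce_loss and phuber_ce_loss follow the original formulations in Zhang and Sabuncu (2018) and Menon et al. (2020), and tce_loss implements the truncated Taylor expansion of -log p underlying the paper's TCE as we understand it. Hyperparameter values come from the excerpt; the architecture and the single learning rate are illustrative assumptions.

```python
import math

import torch
import torch.nn.functional as F
import torchvision

def gce_loss(logits, targets, q=0.7):
    """Generalized cross entropy (Zhang & Sabuncu, 2018): (1 - p_y^q) / q; q = 0.7 per the excerpt."""
    p_y = F.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return ((1.0 - p_y.pow(q)) / q).mean()

def phuber_ce_loss(logits, targets, tau=10.0):
    """Partially Huberised CE (Menon et al., 2020): -log p_y is linearised
    below p_y = 1/tau, so per-example gradients are bounded by tau."""
    p_y = F.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    linear = -tau * p_y + math.log(tau) + 1.0
    # Clamp keeps the unselected log branch finite inside torch.where.
    log_branch = -torch.log(p_y.clamp(min=1.0 / tau))
    return torch.where(p_y <= 1.0 / tau, linear, log_branch).mean()

def tce_loss(logits, targets, t=4):
    """Taylor cross entropy: order-t Taylor expansion of -log(p_y),
    i.e. sum_{i=1}^{t} (1 - p_y)^i / i; t is searched over {2, ..., 6} per the excerpt."""
    p_y = F.softmax(logits, dim=1).gather(1, targets.unsqueeze(1)).squeeze(1)
    return sum((1.0 - p_y).pow(i) / i for i in range(1, t + 1)).mean()

# Training configuration from the excerpt: Adam, 200 epochs, batch size 256.
# lr = 1e-3 stands in for the grid {1e-2, 1e-3, 1e-4, 1e-5}; weight decay is
# 1e-4 on the three *-MNIST datasets and 0 on CIFAR-10/100.
model = torchvision.models.resnet34(num_classes=10)  # the paper's exact ResNet-34 variant is unspecified
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=0.0)
```

A faithful reproduction would grid-search the learning rate over the four values above and the loss hyperparameters (τ over {2, 10}, t over {2, ..., 6}) exactly as the excerpt describes, rather than fixing them as this sketch does.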