Robust Loss Functions under Label Noise for Deep Neural Networks

Authors: Aritra Ghosh, Himanshu Kumar, P. S. Sastry

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through experiments, we illustrate the robustness of risk minimization with such loss functions for learning neural networks.
Researcher Affiliation | Collaboration | Aritra Ghosh (arghosh@microsoft.com), Microsoft, Bangalore; Himanshu Kumar (himanshukr@ee.iisc.ernet.in), Indian Institute of Science, Bangalore; P. S. Sastry (sastry@ee.iisc.ernet.in), Indian Institute of Science, Bangalore
Pseudocode | No | The paper describes procedures and algorithms but does not include structured pseudocode or algorithm blocks.
Open Source Code | No | The paper provides neither a concrete access link nor an explicit statement about releasing the source code for the described methodology.
Open Datasets | Yes | The specific image and text data sets used are shown in Table 1. [...] MNIST (60k, 10k, 10, 28×28), CIFAR-10 (50k, 10k, 10, 3×32×32) (Krizhevsky and Hinton 2009), Reuters RCV1 (213k, 213k, 50, 2000) (Lewis et al. 2004), IMDB Sentiment (20k, 5k, 2, 5k) (Maas et al. 2011)
Dataset Splits | Yes | The specific image and text data sets used are shown in Table 1. In the table, for each data set, we mention the size of the training and test sets (n_tr, n_te).
Hardware Specification | No | The paper does not provide any specific hardware details, such as GPU models, CPU types, or memory specifications, used to run the experiments.
Software Dependencies | No | The paper mentions using "stochastic gradient descent through backpropagation (Bergstra et al. 2010; Chollet 2015)", citations which refer to Theano and Keras respectively, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | All networks used Rectified Linear Units (ReLU) in the hidden layers and a softmax layer at the output... All networks are trained through backpropagation with a momentum term and weight decay. We also used dropout regularization; the dropout rates are shown in Table 1.
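To make the Experiment Setup row concrete, here is a minimal sketch, written against the current Keras API rather than the 2017-era Keras/Theano stack the paper used, of a fully connected network matching that description: ReLU hidden layers, a softmax output, SGD with momentum, weight decay expressed as L2 regularization, and dropout. The layer widths, dropout rate, learning rate, and momentum below are illustrative assumptions, not values from the paper, which reports its per-dataset architectures and dropout rates in Table 1.

```python
# Sketch of the training setup described in the paper's experiment section:
# ReLU hidden layers, softmax output, SGD with momentum, weight decay (L2), dropout.
# All hyperparameter values here are placeholders, not the paper's settings.
from keras.models import Sequential
from keras.layers import Input, Dense, Dropout
from keras.regularizers import l2
from keras.optimizers import SGD

def build_mlp(input_dim, num_classes, hidden_units=(512, 512),
              dropout_rate=0.5, weight_decay=1e-4):
    """Fully connected network: ReLU hidden layers, dropout, softmax output."""
    model = Sequential()
    model.add(Input(shape=(input_dim,)))
    for units in hidden_units:
        model.add(Dense(units, activation="relu",
                        kernel_regularizer=l2(weight_decay)))
        model.add(Dropout(dropout_rate))
    model.add(Dense(num_classes, activation="softmax",
                    kernel_regularizer=l2(weight_decay)))
    return model

# Example: an MNIST-sized model (784 inputs, 10 classes).
model = build_mlp(input_dim=784, num_classes=10)
model.compile(optimizer=SGD(learning_rate=0.01, momentum=0.9),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, epochs=..., batch_size=...)
```

Swapping loss="categorical_crossentropy" for loss="mean_absolute_error" gives the kind of robust-loss comparison under label noise that the paper's experiments are built around.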