Robust Loss Functions under Label Noise for Deep Neural Networks
Authors: Aritra Ghosh, Himanshu Kumar, P. S. Sastry
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments, we illustrate the robustness of risk minimization with such loss functions for learning neural networks. |
| Researcher Affiliation | Collaboration | Aritra Ghosh (arghosh@microsoft.com), Microsoft, Bangalore; Himanshu Kumar (himanshukr@ee.iisc.ernet.in), Indian Institute of Science, Bangalore; P. S. Sastry (sastry@ee.iisc.ernet.in), Indian Institute of Science, Bangalore |
| Pseudocode | No | The paper describes procedures and algorithms but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide a concrete access link or explicit statement about the release of its source code for the methodology described. |
| Open Datasets | Yes | The specific image and text data sets used are shown in Table 1. [...] MNIST (60k, 10k, 10, 28×28), CIFAR-10 (50k, 10k, 10, 3×32×32) (Krizhevsky and Geoffrey 2009), Reuters RCV1 (213k, 213k, 50, 2000) (Lewis et al. 2004), IMDB Sentiment (20k, 5k, 2, 5k) (Maas et al. 2011) |
| Dataset Splits | Yes | The specific image and text data sets used are shown in Table 1. In the table, for each data set, we mention size of training and test sets (ntr, nte). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using 'stochastic gradient descent through backpropagation (Bergstra et al. 2010; Chollet 2015)'; these citations refer to Theano and Keras, respectively, but no version numbers are given for these or any other software dependencies. |
| Experiment Setup | Yes | All networks used Rectified Linear Unit (ReLU) in the hidden layers and have softmax layer at the output... All networks are trained through backpropagation with momentum term and weight decay. We have also used dropout regularization and the dropout rates are also shown in Table 1. |
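
The paper's central technical claim is that certain loss functions, notably mean absolute error computed over the softmax outputs, make risk minimization robust to label noise, whereas categorical cross-entropy is not. A minimal sketch of how such a loss could be expressed, assuming a modern Keras/TensorFlow setup rather than the Theano backend the authors cite:

```python
import tensorflow as tf

def mae_loss(y_true, y_pred):
    # Mean absolute error between the one-hot label vector and the softmax
    # output. Ghosh et al. argue this class of loss is robust to symmetric
    # label noise, unlike categorical cross-entropy.
    return tf.reduce_mean(tf.abs(y_true - y_pred), axis=-1)
```

Passing `mae_loss` (or the built-in `"mean_absolute_error"`) to `model.compile` in place of `"categorical_crossentropy"` is the kind of loss swap the paper's experiments compare.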
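Based on the setup quoted above (ReLU hidden layers, softmax output, dropout regularization, SGD with momentum and weight decay), a hedged Keras sketch of a comparable network follows; layer widths, dropout rate, and optimizer hyperparameters are illustrative placeholders, not the values reported in the paper's Table 1:

```python
from tensorflow import keras
from tensorflow.keras import layers, regularizers

def build_model(input_dim=784, num_classes=10, hidden=512,
                dropout_rate=0.5, l2=1e-4):
    # ReLU hidden layers with dropout and a softmax output, mirroring the
    # reported architecture style; all sizes here are placeholders.
    model = keras.Sequential([
        layers.Input(shape=(input_dim,)),
        layers.Dense(hidden, activation="relu",
                     kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(dropout_rate),
        layers.Dense(hidden, activation="relu",
                     kernel_regularizer=regularizers.l2(l2)),
        layers.Dropout(dropout_rate),
        layers.Dense(num_classes, activation="softmax"),
    ])
    # SGD with a momentum term; weight decay is approximated here via L2
    # kernel regularization. The MAE loss from the sketch above can be
    # swapped in for cross-entropy to test robustness under label noise.
    model.compile(optimizer=keras.optimizers.SGD(learning_rate=0.01,
                                                 momentum=0.9),
                  loss="mean_absolute_error",
                  metrics=["accuracy"])
    return model
```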