Tunable Sensitivity to Large Errors in Neural Network Training

Authors: Gil Keren, Sivan Sabato, Björn Schuller

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We tested our method on several benchmark datasets. We propose, and corroborate in our experiments, that the optimal level of sensitivity to hard examples is positively correlated with the depth of the network. Moreover, the test prediction error obtained by our method is generally lower than that of the vanilla cross-entropy gradient learner.
Researcher Affiliation | Academia | Gil Keren, Chair of Complex and Intelligent Systems, University of Passau, Passau, Germany; Sivan Sabato, Department of Computer Science, Ben-Gurion University of the Negev, Beer Sheva, Israel; Björn Schuller, Chair of Complex and Intelligent Systems, University of Passau, Passau, Germany, and Machine Learning Group, Imperial College London, U.K.
Pseudocode | No | The paper describes its methods mathematically and textually but does not include any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about an open-source code release or links to code repositories for the described methodology.
Open Datasets | Yes | For our experiments, we used four classification benchmark datasets from the field of computer vision: the MNIST dataset (LeCun et al. 1998), the Street View House Numbers dataset (SVHN) (Netzer et al. 2011), and the CIFAR-10 and CIFAR-100 datasets (Krizhevsky and Hinton 2009).
Dataset Splits | Yes | We generated 30,000 examples for each of the training, validation and test datasets.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions general algorithms such as stochastic gradient descent with momentum, but does not specify any software names with version numbers (e.g., libraries, frameworks, or programming languages) used for the implementation.
Experiment Setup | Yes | We used batch gradient descent with a learning rate of 0.01 for optimization of the four parameters, where the gradient is replaced with the pseudo-gradient.
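
The quoted setup does not spell out the pseudo-gradient itself, so the following is only a minimal sketch of the surrounding procedure: full-batch gradient descent over four parameters with a learning rate of 0.01, in which the ordinary gradient is replaced by a caller-supplied pseudo-gradient. The names batch_gradient_descent and example_pseudo_grad, the least-squares placeholder gradient, and the synthetic data are assumptions made for illustration and are not taken from the paper.

import numpy as np


def batch_gradient_descent(params, pseudo_grad_fn, data, lr=0.01, n_steps=1000):
    # Full-batch update loop: at each step the usual gradient is replaced by
    # whatever pseudo-gradient the caller supplies.
    params = np.asarray(params, dtype=float)
    for _ in range(n_steps):
        params = params - lr * pseudo_grad_fn(params, data)
    return params


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))  # four parameters, as in the quoted setup
    y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + 0.1 * rng.normal(size=100)

    def example_pseudo_grad(w, data):
        # Placeholder: the plain least-squares gradient. The paper's
        # pseudo-gradient is defined differently (it tunes sensitivity to
        # large errors); its exact form is not quoted in this report.
        X_, y_ = data
        return X_.T @ (X_ @ w - y_) / len(y_)

    w_hat = batch_gradient_descent(np.zeros(4), example_pseudo_grad, (X, y), lr=0.01)
    print("estimated parameters:", w_hat)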