Stochastic Loss Function
Authors: Qingliang Liu, Jinmei Lai
AAAI 2020, pp. 4884-4891 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on a variety of popular datasets strongly demonstrate that SLF is capable of obtaining appropriate gradients at different stages during training, and can significantly improve the performance of various deep models on real-world tasks including classification, clustering, regression, neural machine translation, and object detection. |
| Researcher Affiliation | Academia | Qingliang Liu, Jinmei Lai State Key Lab of ASIC and System, School of Microelectronics, Fudan University, Shanghai, China {qlliu17, jmlai}@fudan.edu.cn |
| Pseudocode | Yes | Algorithm 1 Stochastic Loss Function. Require: dataset D = {(x_i, y_i)}_{i=1}^N, loss functions L = {ℓ_i}_{i=1}^n, networks f(·; w) and h(·; v), sampling times K. Ensure: trained main network f(·; w*). 1: Randomly initialize parameters w in the main network f(·; w) and v in the decision network h(·; v); 2: for number of training iterations do 3: for (x, y) ∈ D do 4: p = f(x; w) — the estimated output of the main network with parameters w; 5: for each time step k = 1, 2, ..., K do 6: p̂_k = G(h(p; v)) — selecting loss functions with the decision network and Gumbel-Softmax; 7: end for 8: p̂ = (1/K) Σ_{k=1}^{K} p̂_k — weighting and combining the loss functions according to Eq. (10); 9: H(L, w, v) = Σ_{ℓ_i ∈ L} p̂_i ℓ_i(f(x; w), y) — computing the loss of SLF according to Eq. (11); 10: Update w and v by minimizing Eq. (11) with standard back-propagation; 11: end for 12: end for. (A hedged implementation sketch of this loop is given below the table.) |
| Open Source Code | No | The paper does not provide any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | To validate its capability, we carry out a series of image classification tasks on three frequently-used datasets, including MNIST, CIFAR-10, and CIFAR-100. For each dataset, several popular deep neural networks are employed to demonstrate the capability. |
| Dataset Splits | No | The paper mentions using a 'validation set' in the introduction for dynamic adjustment of gradients, but it does not specify the actual data splitting percentages or counts for training, validation, and testing sets needed for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments, only general statements about deep networks. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers, such as programming languages, libraries, or frameworks used for implementation. |
| Experiment Setup | Yes | The hyper-parameters in our experiments are set as follows. In each experiment, our SLF model always inherits the same settings as the compared baselines, including network architectures (e.g., activation functions, initializations, batch sizes, etc.), learning rates, and optimizers, except for the loss functions. |
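The following is a minimal PyTorch sketch of the training loop described in Algorithm 1, assuming a classification setting. The candidate loss set, network sizes, sampling count K, and Gumbel-Softmax temperature are illustrative assumptions, not the authors' exact implementation (no open-source code is available).

```python
# Hedged sketch of Algorithm 1 (Stochastic Loss Function) for classification.
# The candidate losses, architectures, K, and tau are assumptions for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F

# Candidate loss functions ℓ_i (hypothetical set).
def ce_loss(logits, y):
    return F.cross_entropy(logits, y)

def mse_loss(logits, y):
    return F.mse_loss(F.softmax(logits, dim=1),
                      F.one_hot(y, logits.size(1)).float())

def hinge_loss(logits, y):
    return F.multi_margin_loss(logits, y)

candidate_losses = [ce_loss, mse_loss, hinge_loss]

# Main network f(.; w) and decision network h(.; v) (toy architectures).
main_net = nn.Sequential(nn.Flatten(), nn.Linear(784, 256), nn.ReLU(),
                         nn.Linear(256, 10))
decision_net = nn.Sequential(nn.Linear(10, 32), nn.ReLU(),
                             nn.Linear(32, len(candidate_losses)))

optimizer = torch.optim.SGD(list(main_net.parameters()) +
                            list(decision_net.parameters()), lr=0.1)
K, tau = 5, 1.0  # sampling times and Gumbel-Softmax temperature (assumed values)

def slf_step(x, y):
    p = main_net(x)                                        # line 4: p = f(x; w)
    logits = decision_net(p.mean(0, keepdim=True))         # h(p; v), batch-level
    # lines 5-7: draw K differentiable Gumbel-Softmax samples over the losses
    samples = [F.gumbel_softmax(logits, tau=tau, hard=False) for _ in range(K)]
    p_hat = torch.stack(samples).mean(0).squeeze(0)        # line 8: average weights
    # line 9: weighted combination of candidate losses (Eq. (11))
    loss = sum(w_i * l_i(p, y) for w_i, l_i in zip(p_hat, candidate_losses))
    optimizer.zero_grad()
    loss.backward()                                        # line 10: update w and v
    optimizer.step()
    return loss.item()

# Example usage with a random MNIST-sized batch.
x = torch.randn(64, 1, 28, 28)
y = torch.randint(0, 10, (64,))
print(slf_step(x, y))
```

Because the Gumbel-Softmax samples are kept soft (hard=False), the averaged weights p̂ remain differentiable with respect to the decision network's parameters v, so a single backward pass updates both w and v, as in lines 9-10 of the algorithm.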