Fluctuation-dissipation relations for stochastic gradient descent

Authors: Sho Yaida

ICLR 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our claims are empirically verified.
Researcher Affiliation | Industry | Sho Yaida, Facebook AI Research, Facebook Inc., Menlo Park, California 94025, USA. shoyaida@fb.com
Pseudocode | No | The paper describes algorithms and equations but does not include a clearly labeled "Pseudocode" or "Algorithm" block.
Open Source Code | No | No explicit statement about releasing source code or a link to a code repository was found.
Open Datasets | Yes | a multilayer perceptron (MLP) learning patterns in the MNIST training data (Le Cun et al., 1998) through SGD without momentum and a convolutional neural network (CNN) learning patterns in the CIFAR10 training data (Krizhevsky & Hinton, 2009)
Dataset Splits | No | The paper mentions training and test data but does not explicitly describe a validation dataset split or a methodology for it.
Hardware Specification | No | No specific hardware details (GPU/CPU models, processors, or memory specifications) used for running the experiments are provided.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, or TensorFlow versions, or versions of other libraries).
Experiment Setup | Yes | For both models, the mini-batch size is set to be |B| = 100, and the training data are shuffled at each epoch... the L2-regularization term (1/2)λθ² with the weight decay λ = 0.01 is included in the loss function f. The MLP is initialized through the Xavier method (Glorot & Bengio, 2010) and trained for t̂_total^epoch = 100 epochs with the learning rate η = 0.1.
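The Experiment Setup excerpt maps directly onto a short training loop. The sketch below is a minimal, assumption-laden reconstruction in PyTorch: the paper does not state which framework was used, and the MLP's hidden-layer widths are not given in this excerpt, so those below are placeholders. Only the quoted hyperparameters (batch size 100, per-epoch shuffling, weight decay λ = 0.01, Xavier initialization, learning rate 0.1, 100 epochs, plain SGD without momentum) come from the paper.

```python
# Minimal sketch of the described MLP-on-MNIST setup.
# Hidden-layer widths are ASSUMED for illustration; they are not in the excerpt.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

# MNIST training data; the DataLoader reshuffles it at each epoch.
train_data = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
loader = DataLoader(train_data, batch_size=100, shuffle=True)  # |B| = 100

# MLP with Xavier (Glorot) initialization on the weight matrices.
mlp = nn.Sequential(nn.Flatten(),
                    nn.Linear(784, 400), nn.ReLU(),  # width 400 is a placeholder
                    nn.Linear(400, 10))
for p in mlp.parameters():
    if p.dim() > 1:
        nn.init.xavier_uniform_(p)

# Plain SGD (no momentum) with learning rate 0.1; weight_decay=0.01 adds the
# gradient of the L2 term (1/2) * lambda * theta^2 to the loss gradient.
optimizer = torch.optim.SGD(mlp.parameters(), lr=0.1, weight_decay=0.01)
criterion = nn.CrossEntropyLoss()

for epoch in range(100):  # t̂_total^epoch = 100 epochs
    for x, y in loader:
        optimizer.zero_grad()
        loss = criterion(mlp(x), y)
        loss.backward()
        optimizer.step()
```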