Follow the Moving Leader in Deep Learning
Authors: Shuai Zheng, James T. Kwok
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, experiments are performed on a number of deep learning models, including convolutional neural networks (Section 4.1), deep residual networks (Section 4.2), memory networks (Section 4.3), neural conversational model (Section 4.4), deep Q-network (Section 4.5), and long short-term memory (LSTM) (Section 4.6). A summary of the empirical performance of the various deep learning optimizers is presented in Section 4.7. |
| Researcher Affiliation | Academia | Shuai Zheng¹, James T. Kwok¹; ¹Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong. |
| Pseudocode | Yes | Algorithm 1 Follow the Moving Leader (FTML). A hedged sketch of this update appears after the table. |
| Open Source Code | No | The paper references the third-party implementations and libraries that were used (Keras, Torch, and specific GitHub repositories for ResNet, the Neural Conversational Model, and the Atari DQN), but does not release source code for the proposed FTML method itself. |
| Open Datasets | Yes | We use the example models on the MNIST and CIFAR-10 data sets from the Keras library... experiments on the single supporting fact task in the bAbI data set (Sukhbaatar et al., 2015; Weston et al., 2016)... its default data set Cornell Movie-Dialogs Corpus (with 50,000 samples) (Danescu-Niculescu-Mizil & Lee, 2011)... Experiments are performed on two computer games on the Atari 2600 platform: Breakout and Asterix. |
| Dataset Splits | No | The paper mentions using various datasets and reports training loss, but it does not provide specific details on how these datasets were split into training, validation, or test sets (e.g., percentages, sample counts, or references to standard splits). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions using the 'Keras library' and 'Torch implementation' but does not specify version numbers for these or any other software dependencies, making it difficult to reproduce the exact software environment. |
| Experiment Setup | Yes | For FTML, we set β1 = 0.6, β2 = 0.999, and a constant ϵt = ϵ = 10⁻⁸ for all t. For FTML, Adam, RMSprop, and NAG, η is selected by monitoring performance on the training set... The learning rate is chosen from {0.5, 0.25, 0.1, ..., 0.00005, 0.000025, 0.00001}. Minibatches of sizes 128 and 32 are used for MNIST and CIFAR-10, respectively. A minibatch size of 32 is used (ResNet). We truncate backpropagation through time (BPTT) to 5 timesteps, and input 5 samples to the LSTM in each iteration. Illustrative sketches of this setup follow the table. |
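For context on the Pseudocode and Experiment Setup rows, the following is a minimal NumPy sketch of an FTML-style parameter update using the hyperparameters reported in the paper (β1 = 0.6, β2 = 0.999, ϵ = 10⁻⁸). The update equations are written from the commonly cited form of Algorithm 1 rather than copied from the paper, so they should be verified against Algorithm 1 before use; the learning rate shown is only a placeholder, since the paper tunes it per task over a grid.

```python
import numpy as np

def ftml_step(theta, grad, state, t, lr=0.002, beta1=0.6, beta2=0.999, eps=1e-8):
    """One FTML-style update for a parameter array `theta` at step t (t >= 1).

    beta1, beta2, and eps follow the settings reported in the paper; lr is a
    placeholder. The update equations are a sketch of Algorithm 1 and should
    be checked against the paper before use.
    """
    v, d_prev, z = state  # second-moment EMA, previous d_t, linear accumulator

    # Exponential moving average of squared gradients (Adam-style second moment).
    v = beta2 * v + (1.0 - beta2) * grad ** 2

    # Per-coordinate weights d_t, with bias corrections for both moments.
    d = (1.0 - beta1 ** t) / lr * (np.sqrt(v / (1.0 - beta2 ** t)) + eps)
    sigma = d - beta1 * d_prev

    # Accumulate the weighted gradients and proximal terms of the moving leader.
    z = beta1 * z + (1.0 - beta1) * grad - sigma * theta

    # Closed-form minimizer of the regularized leader objective.
    theta_new = -z / d
    return theta_new, (v, d, z)
```

The state (v, d, z) starts as zero arrays and the step counter t starts at 1, so the bias-correction factors 1 − β1^t and 1 − β2^t are well defined from the first step.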
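The data-set access and learning-rate selection described in the Experiment Setup row can be approximated with standard tooling. The sketch below is illustrative rather than a reconstruction of the paper's experiments: it loads MNIST through the Keras dataset API, uses a small stand-in MLP instead of the Keras example model, and uses Adam in place of FTML, since no FTML implementation is released with the paper or shipped with Keras. Only the learning rates explicitly listed in the paper appear in the grid; the "..." in the quoted grid elides intermediate values that are not reconstructed here.

```python
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist

# MNIST is publicly available through the Keras dataset API.
(x_train, y_train), _ = mnist.load_data()
x_train = x_train.reshape(-1, 784).astype("float32") / 255.0

# Learning rates explicitly listed in the paper's grid; the "..." in the
# quoted grid elides intermediate values that are not reconstructed here.
learning_rates = [0.5, 0.25, 0.1, 0.00005, 0.000025, 0.00001]

best_lr, best_loss = None, float("inf")
for lr in learning_rates:
    # Stand-in two-layer MLP; the paper uses the Keras example models instead.
    model = models.Sequential([
        layers.Input(shape=(784,)),
        layers.Dense(128, activation="relu"),
        layers.Dense(10, activation="softmax"),
    ])
    # Adam stands in for FTML here, since no FTML implementation is released
    # with the paper or shipped with Keras.
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
        loss="sparse_categorical_crossentropy",
    )
    history = model.fit(x_train, y_train, batch_size=128, epochs=1, verbose=0)
    train_loss = history.history["loss"][-1]
    if train_loss < best_loss:
        best_lr, best_loss = lr, train_loss

print("selected learning rate:", best_lr)
```

Selection is by final training loss, matching the paper's statement that η is chosen by monitoring performance on the training set; the minibatch size of 128 matches the value reported for MNIST.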