Automated Curriculum Learning for Neural Networks

Authors: Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu

ICML 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimental results for LSTM networks on three curricula demonstrate that our approach can significantly accelerate learning, in some cases halving the time required to attain a satisfactory performance level.
Researcher Affiliation | Industry | Alex Graves, Marc G. Bellemare, Jacob Menick, Rémi Munos, Koray Kavukcuoglu (DeepMind, London, UK). Correspondence to: Alex Graves <gravesa@google.com>.
Pseudocode | Yes | Algorithm 1: Automated Curriculum Learning (a sketch of this bandit-based selection loop appears below the table).
Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository.
Open Datasets | Yes | For our first experiment, we trained character-level Kneser-Ney n-gram models (Kneser and Ney, 1995) on the King James Bible data from the Canterbury corpus (Arnold and Bell, 1997)... We devised a curriculum with both the sequence length and the number of repeats varying from 1 to 13... The bAbI dataset (Weston et al., 2015) consists of 20 synthetic question-answering problems...
Dataset Splits | No | The paper mentions evaluating performance on 'independent samples not used for training or reward calculation' and, for the bAbI tasks, that 'training and evaluation set performance were indistinguishable'. However, it does not specify explicit training, validation, and test splits (e.g., percentages or counts) or reference predefined validation splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions using RMSProp with momentum as an optimizer, but it does not specify any software libraries or frameworks with version numbers (e.g., TensorFlow 2.x, PyTorch 1.x, Python 3.x).
Experiment Setup | Yes | The LSTM network had two layers of 512 cells, and the batch size was 32... The parameters for the Exp3.S algorithm were η = 10^-3, β = 0, ϵ = 0.05... using a momentum of 0.9 and a learning rate of 10^-5 unless specified otherwise (see the configuration sketch below the table).
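The pseudocode row refers to Algorithm 1, which couples an Exp3.S adversarial bandit over tasks with a learning-progress reward. Below is a minimal sketch of that selection loop, assuming a NumPy implementation; the class and method names (Exp3SCurriculum, policy, update), the t^-1 mixing schedule, and the reward-rescaling note in the comments are illustrative assumptions, not the authors' released code.

```python
import numpy as np

class Exp3SCurriculum:
    """Sketch of the Exp3.S bandit used to pick the next training task.

    The reward passed to `update` is assumed to be a learning-progress
    signal (e.g. the drop in training loss on the sampled batch) already
    rescaled to [-1, 1]; the paper rescales with quantiles of recent rewards.
    """

    def __init__(self, n_tasks, eta=1e-3, beta=0.0, eps=0.05):
        self.n = n_tasks
        self.eta = eta    # bandit step size (paper: 1e-3)
        self.beta = beta  # reward-estimate offset (paper: 0)
        self.eps = eps    # exploration rate (paper: 0.05)
        self.w = np.zeros(n_tasks)
        self.t = 0

    def policy(self):
        # Epsilon mixture of a softmax over the weights and the uniform policy.
        p = np.exp(self.w - self.w.max())
        p /= p.sum()
        return (1.0 - self.eps) * p + self.eps / self.n

    def sample_task(self):
        return np.random.choice(self.n, p=self.policy())

    def update(self, task, reward):
        """Exp3.S weight update after observing `reward` for the chosen `task`."""
        self.t += 1
        pi = self.policy()
        # Importance-weighted reward estimate; only the pulled arm
        # receives the observed reward.
        indicator = (np.arange(self.n) == task)
        r_hat = (self.beta + reward * indicator) / pi
        exp_w = np.exp(self.w + self.eta * r_hat)
        # Mix a small amount of mass from the other arms back in (the "S"
        # in Exp3.S); the 1/t schedule for alpha is an assumption of this sketch.
        alpha = 1.0 / self.t
        self.w = np.log((1.0 - alpha) * exp_w
                        + alpha * (exp_w.sum() - exp_w) / (self.n - 1))
```

In use, each training step would sample a task from `policy()`, draw a batch from that task's generator, take a gradient step on the network, convert the resulting progress signal into a reward in [-1, 1], and call `update` with the sampled task and that reward.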
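For the experiment-setup row, the following is a compact configuration sketch built only from the hyperparameters quoted above. The choice of PyTorch and the input dimensionality are assumptions made purely for illustration, since the paper names no framework and releases no code.

```python
import torch
import torch.nn as nn

# Hyperparameters quoted in the paper's experiment setup.
HIDDEN_SIZE = 512     # two LSTM layers of 512 cells
NUM_LAYERS = 2
BATCH_SIZE = 32
LEARNING_RATE = 1e-5  # "unless specified otherwise"
MOMENTUM = 0.9

# INPUT_SIZE is an assumption for illustration; it depends on the task's input encoding.
INPUT_SIZE = 128

model = nn.LSTM(input_size=INPUT_SIZE, hidden_size=HIDDEN_SIZE, num_layers=NUM_LAYERS)
optimizer = torch.optim.RMSprop(model.parameters(), lr=LEARNING_RATE, momentum=MOMENTUM)

# Exp3.S bandit parameters reported for the curriculum (see the sketch above):
# eta = 1e-3, beta = 0, eps = 0.05
```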