On The Power of Curriculum Learning in Training Deep Networks

Authors: Guy Hacohen, Daphna Weinshall

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our empirical evaluation, both methods show similar benefits in terms of increased learning speed and improved final performance on test data. We address challenge (ii) by investigating different pacing functions to guide the sampling. The empirical investigation includes a variety of network architectures, using images from CIFAR-10, CIFAR-100 and subsets of ImageNet. We conclude with a novel theoretical analysis of curriculum learning, where we show how it effectively modifies the optimization landscape.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, The Hebrew University of Jerusalem, Jerusalem 91904, Israel; 2 Edmond and Lily Safra Center for Brain Sciences, The Hebrew University of Jerusalem, Jerusalem 91904, Israel.
Pseudocode | Yes | Algorithm 1 Curriculum learning method (a minimal sketch of this procedure is given below the table)
Open Source Code | Yes | All the code used in the paper is available at https://github.com/GuyHacohen/curriculum_learning
Open Datasets | Yes | The dataset is the small mammals super-class of CIFAR-100 (Krizhevsky & Hinton, 2009), a subset of 3000 images from CIFAR-100, divided into 5 classes. ... using images from CIFAR-10, CIFAR-100 and subsets of ImageNet (Deng et al., 2009). (See the subset-extraction sketch below the table.)
Dataset Splits | Yes | To avoid contamination of the conclusions, all results were cross-validated, wherein the hyper-parameters are chosen based on performance on a validation set before being used on the test set.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments.
Software Dependencies | No | The paper mentions models such as the Inception network, a VGG-based network, and ResNet, but does not provide specific version numbers for the software dependencies or libraries used in the implementation.
Experiment Setup | Yes | Hyper-parameter tuning. As in all empirical studies involving deep learning, the results are quite sensitive to the values of the hyper-parameters, hence parameter tuning is required. ... fixed exponential pacing, varied exponential pacing and single step pacing define 3, 5 and 2 new hyper-parameters respectively, henceforth called the pacing hyper-parameters. ... we set an initial learning rate and decrease it exponentially every fixed number of iterations. This method gives rise to 3 learning rate hyper-parameters which require tuning: (i) the initial learning rate; (ii) the factor by which the learning rate is decreased; (iii) the length of each step with constant learning rate. (See the pacing and learning-rate sketch below the table.)
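
The Pseudocode row cites "Algorithm 1 Curriculum learning method". The following is a minimal sketch of that kind of procedure, not the paper's exact code: it assumes a per-example difficulty score (lower means easier), a pacing function returning how many of the easiest examples are available at a given step, and a `train_on_batch` callback standing in for one optimizer update. All names are illustrative.

```python
import numpy as np

def curriculum_training(X, y, difficulty, pacing, train_on_batch,
                        num_steps, batch_size=100):
    """Sketch of curriculum mini-batch sampling.

    X, y           : training data and labels (numpy arrays)
    difficulty     : per-example difficulty scores, lower = easier (assumed given)
    pacing         : pacing(step) -> number of easiest examples exposed at `step`
    train_on_batch : callback performing one optimization step (assumed)
    """
    order = np.argsort(difficulty)               # easiest examples first
    X, y = X[order], y[order]
    for step in range(num_steps):
        # Restrict sampling to the currently exposed prefix of the sorted data.
        size = min(max(int(pacing(step)), 1), len(X))
        idx = np.random.randint(0, size, batch_size)   # uniform draw from prefix
        train_on_batch(X[idx], y[idx])
```

In this sketch the pacing function only restricts which examples can be sampled; within the exposed prefix the mini-batch is drawn uniformly.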
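The Experiment Setup row quotes an exponentially decreasing learning-rate schedule (initial rate, decrease factor, step length) and exponential pacing hyper-parameters. Below is a hedged sketch of both schedules; the parameter names (`start_fraction`, `inc_factor`, `decay_factor`, the step lengths) are assumptions for illustration, not the paper's exact variables or values.

```python
def exponential_pacing(step, start_fraction, inc_factor, step_length, dataset_size):
    """Number of (difficulty-sorted) examples exposed at a given training step.

    The exposed fraction starts at `start_fraction` and is multiplied by
    `inc_factor` (> 1) every `step_length` iterations, capped at the full set.
    """
    fraction = min(1.0, start_fraction * inc_factor ** (step // step_length))
    return int(fraction * dataset_size)

def exponential_lr(step, initial_lr, decay_factor, step_length):
    """Learning rate decreased exponentially every fixed number of iterations.

    Mirrors the three tuned quantities quoted above: the initial learning rate,
    the factor by which it is decreased (decay_factor < 1 here), and the length
    of each step with constant learning rate.
    """
    return initial_lr * decay_factor ** (step // step_length)
```

For example, `exponential_pacing(0, 0.1, 2.0, 100, 2500)` would expose only the easiest 250 examples of a 2,500-image training set at the first step; the numbers are purely illustrative.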
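The Open Datasets row describes the small-mammals super-class subset of CIFAR-100. The sketch below shows one way to extract it, assuming the standard `cifar-100-python` pickle layout, which stores coarse (super-class) labels alongside the fine labels; the directory path and function name are illustrative.

```python
import pickle
import numpy as np

def load_small_mammals(cifar100_dir):
    # Look up the index of the "small_mammals" super-class in the meta file.
    with open(f"{cifar100_dir}/meta", "rb") as f:
        meta = pickle.load(f, encoding="bytes")
    coarse_names = [n.decode() for n in meta[b"coarse_label_names"]]
    target = coarse_names.index("small_mammals")

    # Load the training batch and keep only images from that super-class.
    with open(f"{cifar100_dir}/train", "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    data = batch[b"data"].reshape(-1, 3, 32, 32).transpose(0, 2, 3, 1)
    coarse = np.array(batch[b"coarse_labels"])
    fine = np.array(batch[b"fine_labels"])

    mask = coarse == target
    return data[mask], fine[mask]   # training images spanning 5 fine classes
```

Loading the `test` file the same way and combining the two splits gives the 3000-image, 5-class subset quoted above.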