Learning Curves for Analysis of Deep Networks

Authors: Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal Shlapentokh-Rothman

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments exemplify use of learning curves for analysis and yield several interesting observations. Experimentally, we find that the extended power law e_test(n) = α + η·n^γ yields a well-fitting learning curve... We validate our choice of learning curve model and estimation method in Sec. 4.1 and use the learning curves to explore impact of design decisions on error and data-reliance in Sec. 4.2.
Researcher Affiliation | Collaboration | (1) University of Illinois at Urbana-Champaign; (2) PRIOR @ Allen Institute for AI.
Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks.
Open Source Code | Yes | Code is currently available at prior.allenai.org/projects/lcurve.
Open Datasets | Yes | Tests are on Cifar100 (Krizhevsky, 2012), Cifar10, Places365 (Zhou et al., 2017), or Caltech101 (Fei-Fei et al., 2006).
Dataset Splits | Yes | The training data are split 80/20, with 80% of the data used for training and validation and the remaining 20% for testing.
Hardware Specification | No | The paper does not specify the exact hardware used for experiments (e.g., GPU model, CPU type, or memory specifications).
Software Dependencies | No | The paper mentions using the Ranger optimizer and its components (Rectified Adam, Look Ahead, Gradient Centralization) but does not provide specific version numbers for these software components or any other libraries (e.g., PyTorch, TensorFlow).
Experiment Setup | No | The paper describes general training strategies like 'initial learning rate that is selected' and 'learning rate schedule is set for each n based on validation', and states 'Other hyperparameters are fixed for all experiments', but it does not provide concrete values for hyperparameters such as the learning rate itself, batch size, or number of epochs.
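To illustrate the learning-curve model quoted above, the extended power law e_test(n) = α + η·n^γ (α the asymptotic error, η the scale of the data-dependent term, γ typically negative) can be fit with generic nonlinear least squares. This is a minimal sketch using SciPy's `curve_fit` on invented synthetic data, not the paper's own estimation method:

```python
import numpy as np
from scipy.optimize import curve_fit

# Extended power law: e_test(n) = alpha + eta * n**gamma.
# alpha: asymptotic error as n -> infinity; eta: scale of the
# data-dependent term; gamma (usually negative): decay rate.
def extended_power_law(n, alpha, eta, gamma):
    return alpha + eta * np.power(n, gamma)

# Hypothetical (n, error) pairs generated from assumed parameters,
# standing in for measured test errors at several training-set sizes.
n = np.array([500, 1000, 2000, 4000, 8000, 16000], dtype=float)
true_params = (0.10, 5.0, -0.5)
err = extended_power_law(n, *true_params)

# Fit all three parameters; p0 is a rough initial guess.
params, _ = curve_fit(extended_power_law, n, err, p0=(0.1, 1.0, -0.5))
alpha, eta, gamma = params
```

On noiseless data the fit recovers the generating parameters; in practice one would fit to measured test errors at each n, and the fitted α and γ summarize asymptotic error and data-reliance respectively.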