Learning Curves for Analysis of Deep Networks
Authors: Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal Shlapentokh-Rothman
ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments exemplify use of learning curves for analysis and yield several interesting observations. Experimentally, we find that the extended power law e_test(n) = α + ηn^γ yields a well-fitting learning curve... We validate our choice of learning curve model and estimation method in Sec. 4.1 and use the learning curves to explore impact of design decisions on error and data-reliance in Sec. 4.2. (A fitting sketch for this functional form follows the table.) |
| Researcher Affiliation | Collaboration | 1University of Illinois at Urbana-Champaign 2PRIOR @ Allen Institute for AI. |
| Pseudocode | No | The paper does not contain any clearly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code is currently available at prior.allenai.org/projects/lcurve. |
| Open Datasets | Yes | Tests are on Cifar100 (Krizhevsky, 2012), Cifar10, Places365 (Zhou et al., 2017), or Caltech101 (Fei-Fei et al., 2006). |
| Dataset Splits | Yes | The training data are split 80/20, with 80% of the data used for training and validation and the remaining 20% for testing. |
| Hardware Specification | No | The paper does not specify the exact hardware used for experiments (e.g., GPU model, CPU type, or memory specifications). |
| Software Dependencies | No | The paper mentions using the Ranger optimizer and its components (Rectified Adam, Look Ahead, Gradient Centralization) but does not provide specific version numbers for these software components or any other libraries (e.g., PyTorch, TensorFlow). |
| Experiment Setup | No | The paper describes general training strategies like 'initial learning rate that is selected' and 'learning rate schedule is set for each n based on validation', and states 'Other hyperparameters are fixed for all experiments', but it does not provide concrete values for hyperparameters such as the learning rate itself, batch size, or number of epochs. |
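
For reference, the extended power law quoted in the "Research Type" row, e_test(n) = α + ηn^γ (α is the asymptotic error, η a scale factor, and γ < 0 the decay exponent), can be fit to measured test errors at several training-set sizes. The snippet below is a minimal sketch using `scipy.optimize.curve_fit` with made-up error values and starting points; it illustrates the functional form only and is not necessarily the authors' estimation procedure or code.

```python
import numpy as np
from scipy.optimize import curve_fit

def extended_power_law(n, alpha, eta, gamma):
    # e_test(n) = alpha + eta * n**gamma, with gamma expected to be negative
    return alpha + eta * np.power(n, gamma)

# Hypothetical test errors at several training-set sizes n
# (placeholder numbers, NOT results from the paper).
n_train = np.array([500., 1000., 2000., 4000., 8000., 16000.])
test_err = np.array([0.62, 0.55, 0.48, 0.42, 0.37, 0.33])

# Fit alpha (asymptotic error), eta (scale), and gamma (decay exponent).
params, _ = curve_fit(
    extended_power_law,
    n_train,
    test_err,
    p0=[0.2, 5.0, -0.5],                          # rough starting point
    bounds=([0.0, 0.0, -2.0], [1.0, np.inf, 0.0]),
)
alpha, eta, gamma = params
print(f"alpha={alpha:.3f}  eta={eta:.3f}  gamma={gamma:.3f}")

# Extrapolate the fitted curve to a larger training-set size.
print("predicted error at n=50000:",
      extended_power_law(50000, alpha, eta, gamma))
```

As described in the quoted abstract, the paper uses such fitted curves to compare design decisions by their effect on error and data-reliance rather than by error at a single training-set size.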