Acceleration via Fractal Learning Rate Schedules

Authors: Naman Agarwal, Surbhi Goel, Cyril Zhang

ICML 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide some experiments to challenge conventional beliefs about stable learning rates in deep learning: the fractal schedule enables training to converge with locally unstable updates which make negative progress on the objective."
Researcher Affiliation | Industry | Google AI Princeton, Princeton, NJ, USA; Microsoft Research, New York, NY, USA. Correspondence to: Cyril Zhang <cyrilzhang@microsoft.com>.
Pseudocode | No | The paper defines constructions and outlines procedures but does not contain a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | Yes | "As an invitation to try these ideas in various experimental settings, we provide in Appendix A some Python code to generate Chebyshev learning rates and fractal schedules." (A hedged sketch of such a generator appears after this table.)
Open Datasets | Yes | "Figure 5 shows training curves for logistic regression for MNIST classification; details are in Appendix F.3." ... "Figure 6: ResNet-18/CIFAR-10 training with batch size 8192 and a repeated T = 8 fractal Chebyshev schedule."
Dataset Splits | No | The paper uses MNIST and CIFAR-10, which have standard splits, but it does not state percentages, sample counts, or a splitting methodology for training, validation, or test sets.
Hardware Specification | No | The paper does not describe the hardware (e.g., GPU or CPU models) used to run its experiments.
Software Dependencies | No | The paper mentions 'Python code' in Appendix A but gives no version numbers for Python or for any other libraries or solvers used in the experiments.
Experiment Setup | Yes | "Figure 6: ResNet-18/CIFAR-10 training with batch size 8192 and a repeated T = 8 fractal Chebyshev schedule."
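
As a hedged illustration of the Appendix A idea referenced in the Open Source Code row (not the authors' code): the sketch below generates the T Chebyshev step sizes for a quadratic whose Hessian eigenvalues are assumed to lie in [mu, L], then reorders them with a recursive interleaving permutation in the Lebedev–Finogenov style as a stand-in for the paper's fractal ordering. The function names, the example values mu = 0.01, L = 1.0, T = 8, and the power-of-two assumption on T are illustrative choices; the authors' exact construction is the one given in their Appendix A.

```python
import numpy as np

def chebyshev_steps(mu, L, T):
    """Step sizes 1/x_t, where x_t are the Chebyshev nodes on [mu, L].

    Assumes gradient descent on a quadratic whose Hessian eigenvalues
    lie in [mu, L]; these are the classic Chebyshev learning rates for
    a horizon of exactly T steps.
    """
    t = np.arange(1, T + 1)
    nodes = (L + mu) / 2 + (L - mu) / 2 * np.cos((2 * t - 1) * np.pi / (2 * T))
    return 1.0 / nodes

def fractal_order(T):
    """Recursive interleaving permutation (Lebedev-Finogenov style).

    A stand-in for the paper's fractal ordering; T is assumed to be a
    power of two. Each doubling step replaces index a in a length-n
    ordering with the pair (a, 2n + 1 - a).
    """
    order = [1]
    while len(order) < T:
        n = len(order)
        order = [x for a in order for x in (a, 2 * n + 1 - a)]
    return np.array(order) - 1  # zero-based indices

# Example: an 8-step fractal Chebyshev schedule for eigenvalues in [0.01, 1.0]
steps = chebyshev_steps(mu=0.01, L=1.0, T=8)
schedule = steps[fractal_order(8)]
print(schedule)
```

The reordering is the substantive part: the largest Chebyshev steps are individually unstable, and interleaving them with the small steps keeps intermediate iterates bounded, which is the behavior the "locally unstable updates" quote in the table refers to.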