Beyond neural scaling laws: beating power law scaling via data pruning
Authors: Ben Sorscher, Robert Geirhos, Shashank Shekhar, Surya Ganguli, Ari Morcos
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We then test this improved scaling prediction with pruned dataset size empirically, and indeed observe better than power law scaling in practice on ResNets trained on CIFAR-10, SVHN, and ImageNet. |
| Researcher Affiliation | Collaboration | Ben Sorscher (1), Robert Geirhos (2), Shashank Shekhar (3), Surya Ganguli (1,3), Ari S. Morcos (3); equal contribution. 1: Department of Applied Physics, Stanford University; 2: University of Tübingen; 3: Meta AI (FAIR) |
| Pseudocode | No | The provided text does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | Yes | All code for theory plots and numerical perceptron simulations is packaged in a reproducible colab notebook. |
| Open Datasets | Yes | We then test this improved scaling prediction with pruned dataset size empirically, and indeed observe better than power law scaling in practice on ResNets trained on CIFAR-10, SVHN, and ImageNet. ... Vision Transformers fine-tuned on CIFAR-10. |
| Dataset Splits | No | The paper refers to 'App. B for pruning and training details' which may contain split information, but this is not provided in the main text. Figure 5B mentions 'top-5 validation accuracy', implying a validation set was used, but its split details are not explicit. |
| Hardware Specification | No | The paper mentions that the 'total amount of compute and the type of resources used' are included (as per checklist 3d), but these details are not present in the provided text. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers in the provided text. |
| Experiment Setup | No | The paper mentions models like 'ResNet18' and datasets like 'CIFAR-10', and refers to 'App. B for all pruning/training details', but it does not specify explicit hyperparameters (e.g., learning rate, batch size) or other system-level training settings in the provided text. |
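
To make the "better than power law scaling via data pruning" claim in the Research Type row concrete, the following is a minimal sketch of the keep-the-hard-examples pruning idea, not the authors' ResNet/ImageNet protocol. The probe model, margin-based score, synthetic data, and retained fractions are illustrative assumptions; the paper's own experiments and metrics (e.g., self-supervised prototype distances on ImageNet) are more involved.

```python
# Hedged sketch: margin-based data pruning on synthetic data, keeping the
# hardest training examples and measuring test error vs. retained fraction.
# All choices below (probe model, score, fractions) are illustrative.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a real dataset such as CIFAR-10.
X, y = make_classification(n_samples=20000, n_features=50, n_informative=20,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25,
                                                    random_state=0)

# A probe model trained on a small random subset supplies per-example scores.
rng = np.random.RandomState(0)
probe_idx = rng.choice(len(X_train), 2000, replace=False)
probe = LogisticRegression(max_iter=1000).fit(X_train[probe_idx],
                                              y_train[probe_idx])

# Score = distance of the predicted class-1 probability from 0.5;
# a small margin marks an example as "hard" for the probe.
margins = np.abs(probe.predict_proba(X_train)[:, 1] - 0.5)

for keep_frac in (1.0, 0.8, 0.6, 0.4, 0.2):
    n_keep = int(keep_frac * len(X_train))
    keep = np.argsort(margins)[:n_keep]  # retain the hardest examples
    model = LogisticRegression(max_iter=1000).fit(X_train[keep], y_train[keep])
    err = 1.0 - model.score(X_test, y_test)
    print(f"kept {keep_frac:.0%} of training data -> test error {err:.3f}")
```

Plotting test error against the retained dataset size on log-log axes is the kind of comparison the paper uses to argue that a well-chosen pruning metric can beat the power-law scaling obtained from random subsets.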