Data Efficient Neural Scaling Law via Model Reusing
Authors: Peihao Wang, Rameswar Panda, Zhangyang Wang
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical study shows that model reusing can effectively reproduce the power law under the data scarcity regime. |
| Researcher Affiliation | Collaboration | ¹Department of Electrical and Computer Engineering, University of Texas at Austin, TX, United States; ²MIT-IBM Watson Lab, MA, United States. |
| Pseudocode | No | The paper describes algorithms and methods in prose and equations but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our code at: https://github.com/VITA-Group/Data-Efficient-Scaling. |
| Open Datasets | Yes | For BERT, we adopt the implementation provided by Tan & Bansal (2020), and choose English Wikipedia (Merity et al., 2016) as the training dataset. For ViT, we utilize the implementation provided by Touvron et al. (2021a). The ImageNet1k (Deng et al., 2009) dataset is chosen as our training data collection. |
| Dataset Splits | No | The paper describes the training datasets (English Wikipedia, ImageNet1k) and mentions evaluating on a "test set" or "test split" but does not explicitly provide details for a validation split or a complete train/validation/test split for reproducibility. |
| Hardware Specification | No | The paper mentions using "computational resources on the AiMOS Supercomputer" but does not provide specific details such as GPU/CPU models, memory, or other hardware specifications used for experiments. |
| Software Dependencies | No | The paper mentions using implementations from other works (Tan & Bansal, 2020; Touvron et al., 2021a) but does not provide specific software dependencies with version numbers (e.g., PyTorch version, Python version). |
| Experiment Setup | Yes | For BERT, the batch size is 256 and the learning rate is set to 2e-4, while for RoBERTa, the batch size is 1024 and the learning rate is 8e-4. |
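
The hyperparameters quoted in the Experiment Setup row can be collected into a minimal configuration sketch. Only the batch sizes and peak learning rates below come from the paper's text; the optimizer and schedule fields are placeholder assumptions and would need to be confirmed against the released code.

```python
# Minimal sketch of the pretraining hyperparameters quoted above.
# Only batch_size and learning_rate are taken from the paper's text;
# optimizer and lr_schedule are assumed placeholders, not stated in this row.
CONFIGS = {
    "bert": {
        "batch_size": 256,
        "learning_rate": 2e-4,
        "optimizer": "AdamW",                       # assumption
        "lr_schedule": "linear_warmup_then_decay",  # assumption
    },
    "roberta": {
        "batch_size": 1024,
        "learning_rate": 8e-4,
        "optimizer": "AdamW",                       # assumption
        "lr_schedule": "linear_warmup_then_decay",  # assumption
    },
}


def get_config(model_name: str) -> dict:
    """Return the quoted hyperparameters for a given model family."""
    return CONFIGS[model_name.lower()]


if __name__ == "__main__":
    print(get_config("BERT"))     # batch_size=256, learning_rate=2e-4
    print(get_config("RoBERTa"))  # batch_size=1024, learning_rate=8e-4
```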