Training Your Sparse Neural Network Better with Any Mask
Authors: Ajay Kumar Jaiswal, Haoyu Ma, Tianlong Chen, Ying Ding, Zhangyang Wang
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We report extensive experiments using a variety of datasets, network architectures, and mask options. Incorporating our techniques in sparse retraining immediately boosts the performance of the sparse mask. |
| Researcher Affiliation | Academia | The University of Texas at Austin; University of California, Irvine. |
| Pseudocode | No | The paper does not contain explicitly labeled 'Pseudocode' or 'Algorithm' blocks. |
| Open Source Code | Yes | Code is at https://github.com/VITA-Group/ToST. |
| Open Datasets | Yes | By adopting our newly curated techniques, we demonstrate significant performance gains across various popular datasets (CIFAR-10, CIFAR-100, Tiny-ImageNet), citing "Tiny-ImageNet (Deng et al., 2009)". |
| Dataset Splits | No | The paper mentions training on CIFAR-10, CIFAR-100, and Tiny-ImageNet, but does not explicitly provide the training/validation/test dataset splits (percentages or sample counts) within the text. |
| Hardware Specification | No | The paper does not explicitly provide specific hardware details such as GPU or CPU models used for running the experiments. |
| Software Dependencies | No | The paper mentions using the "official PyTorch implementation" but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | For training, we adopt an SGD optimizer with momentum 0.9 and weight decay 2e-4. The initial learning rate is set to 0.1, and the networks are trained for 180 epochs with a batch size of 128. The learning rate decays by a factor of 10 at the 90th and 135th epoch during the training. (A minimal PyTorch sketch of this setup follows the table.) |
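
The reported hyperparameters translate directly into a PyTorch optimizer and learning-rate schedule. The sketch below is a minimal reconstruction for orientation, not the authors' released code (see the ToST repository linked above); the ResNet-18 model, CIFAR-10 loader, and data augmentations are placeholder assumptions, while the optimizer settings, milestone schedule, epoch count, and batch size follow the reported setup.

```python
# Minimal sketch of the reported training setup, assuming standard PyTorch/torchvision
# components. Only the optimizer, LR schedule, epoch count, and batch size come from
# the paper; the model, dataset, and augmentations are illustrative assumptions.
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, models, transforms


def build_optimizer_and_scheduler(model: nn.Module):
    """SGD with momentum 0.9, weight decay 2e-4, LR 0.1 decayed 10x at epochs 90 and 135."""
    optimizer = optim.SGD(model.parameters(), lr=0.1, momentum=0.9, weight_decay=2e-4)
    scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[90, 135], gamma=0.1)
    return optimizer, scheduler


if __name__ == "__main__":
    model = models.resnet18(num_classes=10)          # placeholder architecture (assumption)
    optimizer, scheduler = build_optimizer_and_scheduler(model)
    criterion = nn.CrossEntropyLoss()

    transform = transforms.Compose([
        transforms.RandomCrop(32, padding=4),        # assumed standard CIFAR augmentation
        transforms.RandomHorizontalFlip(),
        transforms.ToTensor(),
    ])
    train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True)

    for epoch in range(180):                         # 180 epochs, as reported
        for images, labels in train_loader:
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
        scheduler.step()                             # steps the milestone LR decay per epoch
```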