Multicoated Supermasks Enhance Hidden Networks
Authors: Yasuyuki Okoshi, Ángel López García-Arias, Kazutoshi Hirose, Kota Ando, Kazushi Kawamura, Thiem Van Chu, Masato Motomura, Jaehoon Yu
ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on CIFAR-10 and ImageNet show that Multicoated Supermasks enhance the trade-off between accuracy and model size. |
| Researcher Affiliation | Academia | Tokyo Institute of Technology, Japan. |
| Pseudocode | No | The paper contains mathematical formulations and descriptions of the proposed method but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at: https://github.com/yasu0001/multicoated-supermasks |
| Open Datasets | Yes | We evaluate Multicoated Supermasks for image classification using the CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) datasets. |
| Dataset Splits | Yes | We evaluate Multicoated Supermasks for image classification using the CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) datasets. In CIFAR-10 experiments, the learning rate is decreased by 0.1 after 50 and 75 epochs starting from 0.1 with a batch size of 128; in ImageNet experiments, the learning rate is reduced using cosine annealing starting from 0.1, with a batch size of 256. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications). |
| Software Dependencies | No | The paper states only that all models and experiments are implemented using MMClassification (MMClassification Contributors, 2020), a toolbox based on PyTorch (Paszke et al., 2019). |
| Experiment Setup | Yes | In both cases residual networks (He et al., 2016) are trained for 100 epochs using stochastic gradient descent (SGD) with weight decay of 0.0001 and momentum of 0.9. In CIFAR-10 experiments, the learning rate is decreased by 0.1 after 50 and 75 epochs starting from 0.1 with a batch size of 128; in ImageNet experiments, the learning rate is reduced using cosine annealing starting from 0.1, with a batch size of 256. A PyTorch sketch of this configuration appears below the table. |
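
For readers who want to mirror the reported setup, here is a minimal sketch in plain PyTorch rather than MMClassification. The optimizer and learning-rate schedules follow the values quoted above; the use of torchvision's `resnet18` and the helper name `build_optimizer_and_scheduler` are illustrative assumptions, not details from the paper.

```python
# Hedged sketch of the reported optimization setup in plain PyTorch.
# The paper uses MMClassification; the model choice here (torchvision's
# resnet18) is an illustrative stand-in for its residual networks.
import torch
import torchvision


def build_optimizer_and_scheduler(model, dataset: str):
    """Return SGD plus the learning-rate schedule reported for the dataset."""
    # SGD with the reported hyperparameters: lr 0.1, momentum 0.9, weight decay 1e-4.
    optimizer = torch.optim.SGD(
        model.parameters(), lr=0.1, momentum=0.9, weight_decay=1e-4)
    if dataset == "cifar10":
        # CIFAR-10: multiply the learning rate by 0.1 after epochs 50 and 75
        # (100 training epochs in total, batch size 128).
        scheduler = torch.optim.lr_scheduler.MultiStepLR(
            optimizer, milestones=[50, 75], gamma=0.1)
    else:
        # ImageNet: cosine annealing from 0.1 over the 100 training epochs
        # (batch size 256).
        scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(
            optimizer, T_max=100)
    return optimizer, scheduler


# Usage example for the CIFAR-10 configuration.
model = torchvision.models.resnet18(num_classes=10)
optimizer, scheduler = build_optimizer_and_scheduler(model, "cifar10")
```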