Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Multicoated Supermasks Enhance Hidden Networks
Authors: Yasuyuki Okoshi, Ángel López Garcı́a-Arias, Kazutoshi Hirose, Kota Ando, Kazushi Kawamura, Thiem Van Chu, Masato Motomura, Jaehoon Yu
ICML 2022 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on CIFAR-10 and ImageNet show that Multicoated Supermasks enhance the tradeoff between accuracy and model size. |
| Researcher Affiliation | Academia | Tokyo Institute of Technology, Japan. |
| Pseudocode | No | The paper contains mathematical formulations and descriptions of the proposed method but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | Code available at: https://github.com/yasu0001/multicoated-supermasks |
| Open Datasets | Yes | We evaluate Multicoated Supermasks for image classification using the CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) datasets. |
| Dataset Splits | Yes | We evaluate Multicoated Supermasks for image classification using the CIFAR-10 (Krizhevsky, 2009) and ImageNet (Russakovsky et al., 2015) datasets. In CIFAR-10 experiments, the learning rate is decreased by 0.1 after 50 and 75 epochs starting from 0.1 with a batch size of 128; in ImageNet experiments, the learning rate is reduced using cosine annealing starting from 0.1, with a batch size of 256. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications). |
| Software Dependencies | No | All models and experiments are implemented using MMClassification (MMClassification Contributors, 2020), a toolbox based on PyTorch (Paszke et al., 2019). |
| Experiment Setup | Yes | In both cases residual networks (He et al., 2016) are trained for 100 epochs using stochastic gradient descent (SGD) with weight decay of 0.0001 and momentum of 0.9. In CIFAR-10 experiments, the learning rate is decreased by 0.1 after 50 and 75 epochs starting from 0.1 with a batch size of 128; in ImageNet experiments, the learning rate is reduced using cosine annealing starting from 0.1, with a batch size of 256. |
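
The two learning-rate schedules quoted in the Experiment Setup row can be illustrated with a minimal plain-Python sketch. It is not the authors' MMClassification configuration. The function names `step_lr` and `cosine_lr` are hypothetical, and three points are assumptions not stated in the quoted text: "decreased by 0.1" is read as multiplication by 0.1, epochs are counted from 0, and cosine annealing decays to a minimum rate of 0.

```python
import math

BASE_LR = 0.1        # starting learning rate quoted for both setups
TOTAL_EPOCHS = 100   # training length quoted for both setups


def step_lr(epoch, milestones=(50, 75), gamma=0.1):
    """CIFAR-10 style schedule: scale the rate by gamma at each milestone reached.

    Assumption: "decreased by 0.1" means multiplied by 0.1, and epochs are 0-indexed,
    so epoch 50 is the first epoch trained at the reduced rate.
    """
    reached = sum(1 for m in milestones if epoch >= m)
    return BASE_LR * gamma ** reached


def cosine_lr(epoch, min_lr=0.0):
    """ImageNet style schedule: cosine annealing from BASE_LR towards min_lr.

    Assumption: min_lr = 0.0; the quoted text does not give a final rate.
    """
    progress = epoch / TOTAL_EPOCHS
    return min_lr + 0.5 * (BASE_LR - min_lr) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print both schedules at a few representative epochs.
    for epoch in (0, 49, 50, 75, 99):
        print(f"epoch {epoch:3d}  step={step_lr(epoch):.6f}  cosine={cosine_lr(epoch):.6f}")
```

Under these assumptions the step schedule holds 0.1 until epoch 49, drops to 0.01 at epoch 50 and to 0.001 at epoch 75, while the cosine schedule decays smoothly and passes through 0.05 at the halfway point.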