Dual Lottery Ticket Hypothesis
Authors: Yue Bai, Huan Wang, Zhiqiang Tao, Kunpeng Li, Yun Fu
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on several public datasets and comparisons with competitive approaches validate our DLTH as well as the effectiveness of the proposed model RST. |
| Researcher Affiliation | Collaboration | Northeastern University, Boston, MA, USA; Santa Clara University, Santa Clara, CA, USA; Meta Research, Burlingame, CA, USA |
| Pseudocode | No | The paper describes the Random Sparse Network Transformation (RST) in text but does not provide any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our code is available at https://github.com/yueb17/DLTH. |
| Open Datasets | Yes | Experiments are based on ResNet56/ResNet18 He et al. (2016) on CIFAR10/CIFAR100 Krizhevsky et al. (2009), and an ImageNet subset Deng et al. (2009), to compare our method with the Lottery Ticket Hypothesis (LTH) Frankle & Carbin (2018) and other strong baselines. |
| Dataset Splits | Yes | Experiments are based on ResNet56/ResNet18 He et al. (2016) on CIFAR10/CIFAR100 Krizhevsky et al. (2009), and an ImageNet subset Deng et al. (2009). Total number of epochs is 200 with 0.1/0.01/0.001 learning rates starting at 0/100/150 epochs, respectively. |
| Hardware Specification | Yes | We use 4 NVIDIA Titan XP GPUs to perform our experimental evaluations. |
| Software Dependencies | No | The paper mentions optimization by SGD but does not specify software dependencies like Python, PyTorch, or CUDA with version numbers. |
| Experiment Setup | Yes | Experiments on CIFAR10/CIFAR100 are optimized by SGD with 0.9 momentum and 5e-4 weight decay at a batch size of 128. Total number of epochs is 200 with 0.1/0.01/0.001 learning rates starting at 0/100/150 epochs, respectively. A minimal sketch of this schedule appears below the table. |
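The Experiment Setup row pins down the optimizer and learning-rate schedule but, as the Software Dependencies row notes, the paper does not name its framework or versions. Purely as an illustration, here is a minimal PyTorch sketch of the reported CIFAR recipe; PyTorch itself, torchvision's ResNet-18, and the standard CIFAR-10 loader are assumptions standing in for the paper's exact models and data pipeline:

```python
# Sketch of the reported CIFAR schedule: SGD with 0.9 momentum and 5e-4
# weight decay, batch size 128, 200 epochs, learning rate 0.1/0.01/0.001
# switching at epochs 0/100/150. PyTorch, torchvision's ResNet-18, and
# CIFAR-10 are assumptions; the paper does not specify its software stack.
import torch
import torch.nn as nn
import torchvision
import torchvision.transforms as T

transform = T.Compose([T.ToTensor()])
train_set = torchvision.datasets.CIFAR10(
    root="./data", train=True, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=128, shuffle=True, num_workers=4)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = torchvision.models.resnet18(num_classes=10).to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(
    model.parameters(), lr=0.1, momentum=0.9, weight_decay=5e-4)
# Drops the learning rate by 10x at epochs 100 and 150 (0.1 -> 0.01 -> 0.001).
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[100, 150], gamma=0.1)

for epoch in range(200):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```

This only reproduces the hyperparameters quoted above; the paper's actual RST procedure and sparse-network transformation are in the released code at https://github.com/yueb17/DLTH.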