REVISITING PRUNING AT INITIALIZATION THROUGH THE LENS OF RAMANUJAN GRAPH
Authors: Duc N.M Hoang, Shiwei Liu, Radu Marculescu, Zhangyang Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we examine our earlier claims against empirical results and see how they fare; furthermore, with supporting evidence, we answer questions regarding the relationships between the Ramanujan property and performance, randomness and performance, and the Ramanujan property and randomness. Finally, we point out intuitions and what they imply for PaI through the lens of the Ramanujan perspective. Experimental settings. We conduct our experiments with two different DNN architectures: ResNet-34 (He et al., 2016) and VGG-16 (Simonyan & Zisserman, 2014) on CIFAR-10 (Krizhevsky, 2009). |
| Researcher Affiliation | Academia | Duc Hoang, Shiwei Liu, Radu Marculescu & Zhangyang Wang; Department of Electrical and Computer Engineering, University of Texas at Austin, Austin, TX 78712, USA; {hoangd,radum,atlaswang}@utexas.edu, shiwei.liu@austin.utexas.edu |
| Pseudocode | No | The paper does not contain any explicit pseudocode blocks or algorithms. |
| Open Source Code | Yes | Our code is available at: https://github.com/VITA-Group/ramanujan-on-pai. |
| Open Datasets | Yes | Experimental settings. We conduct our experiments with two different DNN architectures: ResNet-34 (He et al., 2016) and VGG-16 (Simonyan & Zisserman, 2014) on CIFAR-10 (Krizhevsky, 2009). We include additional results on CIFAR-100 in the Appendix. |
| Dataset Splits | No | The paper uses CIFAR-10 and CIFAR-100, which ship with standard train/test splits, but it does not describe how the data was partitioned for training, validation, and testing (e.g., an 80/10/10 split or per-set sample counts) and does not state whether a validation set was used. (A hedged data-loading sketch follows the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used, such as GPU/CPU models, memory specifications, or cloud computing instances. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | Table 1: Summary of architectures and hyperparameters that we study in this paper. ResNet-34, CIFAR-10: 250 epochs, batch size 256, SGD, LR 0.1, LR decayed by 10× at epochs [160, 180], weight decay 0.0005. VGG-16, CIFAR-10: identical settings (250 epochs, batch size 256, SGD, LR 0.1, LR decayed by 10× at epochs [160, 180], weight decay 0.0005). (A hedged training-configuration sketch follows the table.) |
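The paper relies on CIFAR-10's standard train/test split and does not document a validation partition. The sketch below shows one plausible PyTorch data pipeline; the 45k/5k train/validation carve-out, the fixed seed, and the augmentation choices are assumptions for illustration, not settings reported in the paper.

```python
import torch
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

# Standard CIFAR-10 channel statistics.
MEAN, STD = (0.4914, 0.4822, 0.4465), (0.2470, 0.2435, 0.2616)

train_tf = transforms.Compose([
    transforms.RandomCrop(32, padding=4),   # common CIFAR augmentation (assumption)
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(MEAN, STD),
])
eval_tf = transforms.Compose([transforms.ToTensor(), transforms.Normalize(MEAN, STD)])

full_train = datasets.CIFAR10("./data", train=True, download=True, transform=train_tf)
test_set = datasets.CIFAR10("./data", train=False, download=True, transform=eval_tf)

# The paper does not mention a validation set; a 45k/5k split with a fixed
# seed is a common convention and is purely an assumption here.
train_set, val_set = random_split(
    full_train, [45_000, 5_000], generator=torch.Generator().manual_seed(0)
)

train_loader = DataLoader(train_set, batch_size=256, shuffle=True, num_workers=4)
val_loader = DataLoader(val_set, batch_size=256, shuffle=False, num_workers=4)
test_loader = DataLoader(test_set, batch_size=256, shuffle=False, num_workers=4)
```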
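Table 1's hyperparameters translate directly into a standard PyTorch training configuration. The sketch below wires them together; the momentum value, the torchvision model variant, and the cross-entropy loss are assumptions, since Table 1 does not list them, and the paper's actual experiments use CIFAR-sized architectures.

```python
import torch
from torchvision.models import resnet34

# Table 1: 250 epochs, batch size 256, SGD, LR 0.1,
# LR divided by 10 at epochs 160 and 180, weight decay 0.0005.
model = resnet34(num_classes=10)  # torchvision stand-in for the paper's ResNet-34 (assumption)

optimizer = torch.optim.SGD(
    model.parameters(),
    lr=0.1,
    momentum=0.9,          # momentum is not listed in Table 1 (assumption)
    weight_decay=5e-4,
)
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[160, 180], gamma=0.1
)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(250):
    model.train()
    for images, labels in train_loader:  # train_loader from the data sketch above
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
    scheduler.step()
```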