Validating the Lottery Ticket Hypothesis with Inertial Manifold Theory
Authors: Zeru Zhang, Jiayin Jin, Zijie Zhang, Yang Zhou, Xin Zhao, Jiaxiang Ren, Ji Liu, Lingfei Wu, Ruoming Jin, Dejing Dou
NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive evaluation on real datasets demonstrates the superior performance of our proposed IMC model against several state-of-the-art neural network pruning and LTH methods. More experiments, implementation details, and hyperparameter settings are presented in Appendices A.3-A.5. |
| Researcher Affiliation | Collaboration | 1Auburn University, 2Baidu Research, 3JD.COM Silicon Valley Research Center, 4Kent State University, 5University of Oregon |
| Pseudocode | Yes | The following are the algorithm descriptions of our Inertial Manifold-based neural network Compression (IMC) method step by step: (1) Given a dense neural network f(x; W) with randomly initialized flattened weight parameters W = W_0 ∈ ℝ^d, when optimizing W with stochastic gradient descent (SGD) on a training set, we generate an approximate W* in a few training iterations (10 iterations in our implementation), where W* is a local minimum point of the loss function L with respect to W in Eq. (6) (i.e., ∇L(W*) = 0). ... and (6) We prune the original network with high-dimensional parameters W ∈ ℝ^d to generate a subnetwork with low-dimensional parameters W^+ ∈ ℝ^{d−k}, i.e., reduce W to W^+, and train W^+ until convergence. (A hedged code sketch of this pipeline appears after the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-sourcing of its code. |
| Open Datasets | Yes | In this section, we have evaluated the effectiveness of our IMC model and other baselines for neural network pruning over three standard image classification datasets: CIFAR-10 [28], CIFAR-100 [28], and ImageNet [10]. |
| Dataset Splits | No | The paper mentions training and testing on datasets like CIFAR-10, CIFAR-100, and ImageNet, and states 'The experiments exactly follow the same settings described by the original LTH paper [13, 14] and other following works on LTH and network pruning', but does not explicitly provide specific percentages or sample counts for train/validation/test splits within the paper. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used for the experiments (e.g., GPU models, CPU types, or cloud resources with specifications). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the implementation. |
| Experiment Setup | Yes | More experiments, implementation details, and hyperparameter settings are presented in Appendices A.3-A.5. ... (1) Given a dense neural network f(x; W) with randomly initialized flattened weight parameters W = W_0 ∈ ℝ^d, when optimizing W with stochastic gradient descent (SGD) on a training set, we generate an approximate W* in a few training iterations (10 iterations in our implementation), where W* is a local minimum point of the loss function L with respect to W in Eq. (6) (i.e., ∇L(W*) = 0). ... In our implementation, we choose p = 2 and = 0.01. ... The experiments exactly follow the same settings described by the original LTH paper [13, 14] and other following works on LTH and network pruning [70, 59, 15, 53, 69, 11]. |
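
For concreteness, the following is a minimal sketch of the prune-then-retrain pipeline quoted in the Pseudocode row, written in PyTorch. It is an illustration under stated assumptions, not the authors' implementation: the function names (`warmup_sgd`, `select_masks`, `train_subnetwork`), the keep ratio, the learning rates, and the epoch counts are all assumed, and the coordinate-selection step uses a plain magnitude heuristic as a stand-in for the paper's inertial-manifold-based criterion, which is not reproduced here.

```python
# Minimal sketch of an IMC-style prune-then-retrain pipeline (assumed, not the authors' code).
# The selection of retained coordinates uses a simple magnitude heuristic as a placeholder
# for the paper's inertial-manifold-based criterion.
from itertools import cycle

import torch
import torch.nn.functional as F


def warmup_sgd(model, loader, n_iters=10, lr=0.1):
    """Run a few SGD iterations to reach an approximate stationary point W* of the loss."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    batches = cycle(loader)
    for _ in range(n_iters):
        x, y = next(batches)
        opt.zero_grad()
        F.cross_entropy(model(x), y).backward()
        opt.step()
    return model


def select_masks(model, keep_ratio=0.2):
    """Choose the low-dimensional coordinates W+ to keep (magnitude-based placeholder)."""
    scores = torch.cat([p.detach().abs().flatten() for p in model.parameters()])
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = torch.topk(scores, k).values.min()
    return [(p.detach().abs() >= threshold).float() for p in model.parameters()]


def train_subnetwork(model, masks, loader, epochs=1, lr=0.01):
    """Zero the pruned coordinates and train only the retained parameters W+."""
    with torch.no_grad():
        for p, m in zip(model.parameters(), masks):
            p.mul_(m)
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(epochs):
        for x, y in loader:
            opt.zero_grad()
            F.cross_entropy(model(x), y).backward()
            with torch.no_grad():
                for p, m in zip(model.parameters(), masks):
                    p.grad.mul_(m)  # keep pruned coordinates frozen at zero
            opt.step()
    return model


# Usage sketch (a CIFAR-style `train_loader` and classifier `net` are assumed to exist):
# net = warmup_sgd(net, train_loader, n_iters=10)
# masks = select_masks(net, keep_ratio=0.2)
# net = train_subnetwork(net, masks, train_loader, epochs=10)
```

Masking both the weights and their gradients keeps the pruned coordinates at exactly zero under plain SGD, which is what the quoted step (6), "reduce W to W^+, and train W^+ until convergence," requires of any concrete implementation.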