Sparse Flows: Pruning Continuous-depth Models
Authors: Lucas Liebenwein, Ramin Hasani, Alexander Amini, Daniela Rus
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform a diverse set of experiments demonstrating the effect of pruning on the generalization capability of continuous-depth models. |
| Researcher Affiliation | Academia | Lucas Liebenwein MIT CSAIL lucas@csail.mit.edu Ramin Hasani MIT CSAIL rhasani@mit.edu Alexander Amini MIT CSAIL amini@mit.edu Daniela Rus MIT CSAIL rus@csail.mit.edu |
| Pseudocode | Yes | Algorithm 1 SPARSEFLOW(f, Φtrain, PR, e). Input: f: neural ODE model with parameter set θ; Φtrain: hyper-parameters for training; PR: relative prune ratio; e: number of training epochs per prune-cycle. Output: f(·; θ̂): Sparse Flow; m: sparse connection pattern. 1: θ0 ← RANDOMINIT() 2: θ ← TRAIN(θ0, Φtrain, e) ▷ Initial training stage with dense neural ODE ("warm start"). 3: m ← 1_\|θ0\| ▷ Initialize binary mask indicating neural connection pattern. 4: while validation loss of Sparse Flow decreases do 5: m ← PRUNE(m ⊙ θ, PR) ▷ Prune PR% of the remaining parameters and update mask. 6: θ ← TRAIN(m ⊙ θ, Φtrain, e) ▷ Restart training with updated connection pattern. 7: end while 8: θ̂ ← m ⊙ θ, and return f(·; θ̂), m (a runnable sketch of this loop follows the table) |
| Open Source Code | Yes | Code: https://github.com/lucaslie/torchprune |
| Open Datasets | Yes | We scale our experiments to a set of five real-world tabular datasets (prepared based on the instructions given by Papamakarios et al. (2017) and Grathwohl et al. (2019)) to verify our empirical observations about the effect of pruning on the generalizability of continuous normalizing flows. |
| Dataset Splits | No | Subsequently, we proceed by iteratively pruning and retraining the network until we either obtain the desired level of sparsity (i.e., prune ratio) or the loss for a pre-specified hold-out dataset (validation loss) starts to deteriorate (early stopping). |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) are provided for the experimental setup in the paper. |
| Software Dependencies | No | We used two code bases (FFJORD from Grathwohl et al. (2019) and TorchDyn (Poli et al., 2020a)) over which we implemented our pruning framework. |
| Experiment Setup | No | We use Adam with a fixed step learning decay schedule and weight decay in some instances. |
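
The Algorithm 1 pseudocode quoted in the Pseudocode row is an iterative prune–retrain loop with validation-based early stopping. The sketch below is a minimal, illustrative PyTorch rendering of that loop, not the authors' torchprune implementation: global magnitude pruning over weight matrices, the `train_one_cycle` and `validation_loss` callables, and the decision to restore the last improving checkpoint are assumptions made here for concreteness.

```python
# Minimal sketch of the Algorithm 1 (SparseFlow) loop.
# All helper names are illustrative placeholders, not the authors' torchprune API.
import copy
import torch

PRUNE_RATIO = 0.3  # fraction of the *remaining* weights removed per cycle (assumed value)


def apply_mask(model, masks):
    """Zero out pruned connections, i.e. keep only m ⊙ θ."""
    with torch.no_grad():
        for name, param in model.named_parameters():
            if name in masks:
                param.mul_(masks[name])


def magnitude_prune(model, masks, ratio):
    """Drop `ratio` of the still-active weights with the smallest magnitude (global threshold)."""
    surviving = torch.cat([
        param[masks[name].bool()].abs().flatten()
        for name, param in model.named_parameters() if name in masks
    ])
    k = int(ratio * surviving.numel())
    if k == 0:
        return masks
    threshold = torch.kthvalue(surviving, k).values
    for name, param in model.named_parameters():
        if name in masks:
            masks[name] = masks[name] * (param.abs() > threshold).float()
    return masks


def sparse_flow(model, train_one_cycle, validation_loss):
    """`train_one_cycle(model, masks)` runs the e training epochs of one prune-cycle and is
    expected to re-apply the mask after every optimizer step; `validation_loss(model)`
    evaluates the hold-out loss. Both are user-supplied callables."""
    # m <- 1_{|θ|}: start from an all-ones mask (dense network); prune weight matrices only.
    masks = {name: torch.ones_like(p) for name, p in model.named_parameters() if p.dim() > 1}
    train_one_cycle(model, masks)  # warm start with the dense neural ODE
    best_val = validation_loss(model)
    best_state = copy.deepcopy(model.state_dict())
    while True:
        masks = magnitude_prune(model, masks, PRUNE_RATIO)  # prune PR% of the remaining weights
        apply_mask(model, masks)
        train_one_cycle(model, masks)  # restart training with the updated connection pattern
        val = validation_loss(model)
        if val >= best_val:  # stop once the validation loss no longer decreases
            model.load_state_dict(best_state)  # revert to the last improving checkpoint (assumption)
            break
        best_val = val
        best_state = copy.deepcopy(model.state_dict())
    # A target overall prune ratio could serve as an additional stopping criterion
    # (cf. the Dataset Splits row above).
    return model, masks
```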
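
The Experiment Setup row reports Adam with a fixed-step learning-rate decay schedule and, in some instances, weight decay. A hedged configuration sketch is given below; the learning rate, decay step, decay factor, and weight-decay coefficient are placeholder values, not the ones used in the paper.

```python
# Sketch of the reported optimizer setup: Adam + fixed-step LR decay (+ optional weight decay).
# All numeric values are assumptions for illustration only.
import torch


def make_optimizer(model, use_weight_decay=False):
    optimizer = torch.optim.Adam(
        model.parameters(),
        lr=1e-3,
        weight_decay=1e-4 if use_weight_decay else 0.0,
    )
    # Fixed-step decay: multiply the learning rate by `gamma` every `step_size` epochs.
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=30, gamma=0.1)
    return optimizer, scheduler
```

With this setup, `scheduler.step()` would be called once per epoch after that epoch's training loop completes.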