Sparsity in Continuous-Depth Neural Networks
Authors: Hananeh Aliee, Till Richter, Mikhail Solonin, Ignacio Ibarra, Fabian Theis, Niki Kilbertus
NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive empirical evaluation on these challenging benchmarks suggests that weight sparsity improves generalization in the presence of noise or irregular sampling. However, it does not prevent learning spurious feature dependencies in the inferred dynamics, rendering them impractical for predictions under interventions, or for inferring the true underlying dynamics. Instead, feature sparsity can indeed help with recovering sparse ground-truth dynamics compared to unregularized NODEs. |
| Researcher Affiliation | Collaboration | Hananeh Aliee (Helmholtz Munich); Till Richter (Helmholtz Munich); Mikhail Solonin (Technical University of Munich); Ignacio Ibarra (Helmholtz Munich); Fabian Theis (Technical University of Munich; Helmholtz Munich); Niki Kilbertus (Technical University of Munich; Helmholtz AI, Munich). Contact: {hananeh.aliee,till.richter,ignacio.ibarra,fabian.theis,niki.kilbertus}@helmholtz-muenchen.de. Work done while at TUM. MS is currently employed by J.P. Morgan Chase & Co.; mikhail.solonin@jpmorgan.com |
| Pseudocode | No | The paper does not contain any explicit pseudocode blocks or sections labeled “Algorithm”. |
| Open Source Code | Yes | The Python implementation is available at: https://github.com/theislab/PathReg (footnote 2). |
| Open Datasets | Yes | We curate large, real-world datasets consisting of human motion capture (mocap.cs.cmu.edu) as well as human hematopoiesis single-cell RNA-seq [32] data for our empirical evaluations. |
| Dataset Splits | No | The paper refers to training and test data but does not specify a separate validation split or describe how hyperparameters were tuned. Appendix A states: “The evaluation of models are done on the test set, after models are trained,” which implies a plain train/test split with no dedicated validation set for hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory amounts. Although the ethics statement indicates that compute resources were reported, it refers only generically to the “type of GPUs, internal cluster, or cloud provider” without naming any specific models or configurations. |
| Software Dependencies | No | The paper mentions using the “Adam optimizer [23]” and states that “The python implementation is available at” (footnote 2). However, it does not specify version numbers for Python or for any libraries or dependencies (e.g., PyTorch, TensorFlow, scikit-learn), which are important for reproducibility. |
| Experiment Setup | Yes | A detailed description of our training procedures and architecture choices for each experiment is provided in Appendix A. We train all models for 500 epochs using the Adam optimizer [23] with a learning rate of 1e−2 and weight decay of 1e−5. We use a batch size of 20. |
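
The experiment-setup row above reports concrete training hyperparameters (Adam, learning rate 1e−2, weight decay 1e−5, 500 epochs, batch size 20). The sketch below is a minimal, hypothetical reconstruction of that configuration: PyTorch, the placeholder network, and the dummy data are assumptions not taken from the paper, whose actual experiments train neural-ODE dynamics on trajectory data (mocap, single-cell RNA-seq) rather than a plain regression model.

```python
# Hedged reconstruction of the reported training configuration.
# From the paper's quote: Adam, lr = 1e-2, weight decay = 1e-5, 500 epochs, batch size 20.
# PyTorch and the toy model/data below are assumptions, not confirmed by the paper.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Placeholder network standing in for the NODE vector field (assumption).
model = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 10))

# Hyperparameters exactly as reported in the experiment-setup quote.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2, weight_decay=1e-5)
loss_fn = nn.MSELoss()

# Dummy data; the real experiments use trajectories, not i.i.d. regression pairs.
dataset = TensorDataset(torch.randn(200, 10), torch.randn(200, 10))
loader = DataLoader(dataset, batch_size=20, shuffle=True)  # batch size 20

for epoch in range(500):  # 500 epochs
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```

Pinning such a loop to exact library releases (for example, via a requirements file listing the specific torch version used) would address the missing software-dependency information noted in the table.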