A Theoretical View on Sparsely Activated Networks

Authors: Cenk Baykal, Nishanth Dikkala, Rina Panigrahy, Cyrus Rashtchian, Xin Wang

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "To support our theory, we perform experiments in Section 5 on approximating Lipschitz functions with sparse networks. We identify several synthetic datasets where models with data-dependent sparse layers outperform dense models of the same size. Moreover, we achieve these results with relatively small networks."
Researcher Affiliation | Industry | Cenk Baykal (Google Research), Nishanth Dikkala (Google Research), Rina Panigrahy (Google Research), Cyrus Rashtchian (Google Research), Xin Wang (Google Research)
Pseudocode | No | The paper describes models and algorithms in text and mathematical formulations but does not include structured pseudocode blocks or algorithm listings.
Open Source Code | Yes | "We include details to reproduce all datasets and experiments."
Open Datasets | Yes | "On CIFAR-10, we also see that the DSM model performs comparably or better than the dense network."
Dataset Splits | No | The paper mentions training on CIFAR-10 and evaluating on the test dataset, but it does not specify the train/validation/test split percentages or sample counts.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU, CPU models, or specific cloud instances) used for running the experiments.
Software Dependencies | No | The paper mentions using the ADAM optimizer, but does not provide specific software names with version numbers for reproducibility (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "Both models are trained with ADAM optimizer for 50 epochs and evaluated on the test dataset for model accuracy with no data augmentation." (A minimal training sketch follows this table.)
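The Research Type and Experiment Setup rows above quote the paper's use of data-dependent sparse layers and its training recipe (Adam optimizer, 50 epochs, CIFAR-10 test-set evaluation, no data augmentation). The following is a minimal sketch of that setup under stated assumptions: it uses PyTorch (the paper does not name a framework), a top-k activation rule as a stand-in for the paper's DSM sparsity mechanism (whose exact construction may differ), and hypothetical width, k, batch-size, and learning-rate values. It is illustrative only, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): a small MLP with a data-dependent
# top-k sparse hidden layer, trained on CIFAR-10 with Adam for 50 epochs and
# no data augmentation. Hyperparameters (width, k, lr, batch size) are assumed.
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader
from torchvision import datasets, transforms


class TopKSparseLayer(nn.Module):
    """Linear layer that keeps only the k largest activations per input."""

    def __init__(self, in_features: int, out_features: int, k: int):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features)
        self.k = k

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        z = self.linear(x)
        # Data-dependent sparsity: which units stay active depends on the input.
        _, topk_idx = z.topk(self.k, dim=-1)
        mask = torch.zeros_like(z).scatter_(-1, topk_idx, 1.0)
        return F.relu(z) * mask


class SparseMLP(nn.Module):
    def __init__(self, width: int = 512, k: int = 64, num_classes: int = 10):
        super().__init__()
        self.flatten = nn.Flatten()
        self.sparse = TopKSparseLayer(3 * 32 * 32, width, k)
        self.head = nn.Linear(width, num_classes)

    def forward(self, x):
        return self.head(self.sparse(self.flatten(x)))


def train_and_evaluate(epochs: int = 50, batch_size: int = 128, lr: float = 1e-3):
    # No data augmentation: only tensor conversion.
    tfm = transforms.ToTensor()
    train_set = datasets.CIFAR10("data", train=True, download=True, transform=tfm)
    test_set = datasets.CIFAR10("data", train=False, download=True, transform=tfm)
    train_loader = DataLoader(train_set, batch_size=batch_size, shuffle=True)
    test_loader = DataLoader(test_set, batch_size=batch_size)

    model = SparseMLP()
    opt = torch.optim.Adam(model.parameters(), lr=lr)

    for _ in range(epochs):
        model.train()
        for images, labels in train_loader:
            opt.zero_grad()
            loss = F.cross_entropy(model(images), labels)
            loss.backward()
            opt.step()

    # Evaluate accuracy on the held-out test set.
    model.eval()
    correct = 0
    with torch.no_grad():
        for images, labels in test_loader:
            correct += (model(images).argmax(dim=-1) == labels).sum().item()
    return correct / len(test_set)
```

A dense baseline of the same size can be obtained by replacing TopKSparseLayer with an ordinary nn.Linear followed by ReLU, which mirrors the paper's comparison of sparse and dense models of equal width.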