Compressing Neural Networks with the Hashing Trick

Authors: Wenlin Chen, James T. Wilson, Stephen Tyree, Kilian Q. Weinberger, Yixin Chen

ICML 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 6. Experimental Results: We conduct extensive experiments to evaluate HashedNets on eight benchmark datasets.
Researcher Affiliation | Collaboration | Wenlin Chen (WENLINCHEN@WUSTL.EDU), James T. Wilson (J.WILSON@WUSTL.EDU), Stephen Tyree (STYREE@NVIDIA.COM), Kilian Q. Weinberger (KILIAN@WUSTL.EDU), Yixin Chen (CHEN@CSE.WUSTL.EDU); Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO, USA; NVIDIA, Santa Clara, CA, USA
Pseudocode | No | The paper describes the computational steps with mathematical equations but does not provide structured pseudocode or an algorithm block (a hedged sketch of the hashed-layer computation appears after this table).
Open Source Code | No | The paper mentions using the third-party open-source implementation "xxHash" but does not provide a statement or link for the source code of its own methodology.
Open Datasets | Yes | Datasets consist of the original MNIST handwritten digit dataset, along with four challenging variants (Larochelle et al., 2007). Each variation amends the original through digit rotation (ROT), background superimposition (BG-RAND and BG-IMG), or a combination thereof (BG-IMG-ROT). In addition, we include two binary image classification datasets: CONVEX and RECT (Larochelle et al., 2007).
Dataset Splits | Yes | Hyperparameters are selected for all algorithms with Bayesian optimization (Snoek et al., 2012) and hand tuning on 20% validation splits of the training sets.
Hardware Specification | Yes | HashedNets and all accompanying baselines were implemented using Torch7 (Collobert et al., 2011) and run on NVIDIA GTX TITAN graphics cards with 2688 cores and 6GB of global memory.
Software Dependencies | No | The paper mentions Torch7 and the Bayesian optimization MATLAB implementation bayesopt.m but does not specify version numbers for these software dependencies.
Experiment Setup | Yes | Models are trained via stochastic gradient descent (minibatch size of 50) with dropout and momentum. ReLU is adopted as the activation function for all models. Hyperparameters are selected for all algorithms with Bayesian optimization (Snoek et al., 2012) and hand tuning on 20% validation splits of the training sets. (A hedged training-configuration sketch also follows after the table.)
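Because the paper presents the hashed layer only through equations, the following is a minimal NumPy sketch of the idea as described there: each entry of the virtual weight matrix V is looked up as V[i, j] = ξ(i, j) · w[h(i, j)], where h hashes the index pair into a small vector of shared weights and ξ is a sign hash that reduces collision bias. The hash function below is Python's built-in hash rather than the xxHash library used by the authors, and all sizes are illustrative rather than taken from the paper.

```python
# Minimal NumPy sketch of the hashing-trick layer described by the paper's equations.
# This is a reading of the method, not the authors' code: the hash below is Python's
# built-in hash() rather than the xxHash library used in the paper, and all sizes
# (inputs, outputs, number of shared weights) are illustrative.
import numpy as np

def hashed_layer_forward(x, w, n_out, seed=0):
    """Compute a = V @ x without ever storing the virtual matrix V.

    V[i, j] = xi(i, j) * w[h(i, j)], where h maps the index pair (i, j) into the
    K shared real weights in w, and xi(i, j) in {-1, +1} reduces collision bias.
    """
    n_in = x.shape[0]
    K = w.shape[0]
    a = np.zeros(n_out)
    for i in range(n_out):
        for j in range(n_in):
            bucket = hash((seed, i, j)) % K                           # shared-weight index for (i, j)
            sign = 1.0 if hash((seed + 1, i, j)) % 2 == 0 else -1.0   # sign hash xi(i, j)
            a[i] += sign * w[bucket] * x[j]
    return a

# Toy usage: a 100 x 100 virtual layer backed by only 20 real weights, ReLU activation.
rng = np.random.default_rng(0)
w = rng.standard_normal(20) * 0.1
x = rng.standard_normal(100)
hidden = np.maximum(hashed_layer_forward(x, w, n_out=100), 0.0)
```

During training, gradients of all virtual entries that hash into the same bucket accumulate into that single shared weight, which is how the compression is preserved while learning.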
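The Experiment Setup row can likewise be summarized in a short training sketch. The paper's experiments were implemented in Torch7; the snippet below is a PyTorch rendering in which only the minibatch size of 50, momentum, dropout, and ReLU come from the text, while the layer widths, learning rate, momentum coefficient, and dropout probability are illustrative assumptions.

```python
# Hedged PyTorch sketch of the reported training setup: SGD with minibatches of 50,
# momentum, dropout, and ReLU activations. The paper used Torch7; layer widths,
# learning rate, momentum value, and dropout probability here are assumptions for
# illustration, not numbers taken from the paper.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(784, 1000),   # plain dense layers stand in for the paper's hashed layers
    nn.ReLU(),
    nn.Dropout(p=0.5),
    nn.Linear(1000, 10),
)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

def train_epoch(loader):
    """Run one epoch; `loader` should yield (images, labels) minibatches of size 50."""
    model.train()
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x.view(x.size(0), -1)), y)
        loss.backward()
        optimizer.step()
```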