The Local Elasticity of Neural Networks

Authors: Hangfeng He, Weijie Su

ICLR 2020

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "This phenomenon is shown to persist for neural networks with nonlinear activation functions through extensive simulations on real-life and synthetic datasets... The effectiveness of the clustering algorithm on the MNIST and CIFAR-10 datasets in turn corroborates the hypothesis of local elasticity of neural networks on real-life data." |
| Researcher Affiliation | Academia | Hangfeng He & Weijie J. Su, University of Pennsylvania, Philadelphia, PA (hangfeng@seas.upenn.edu, suw@wharton.upenn.edu) |
| Pseudocode | Yes | "Algorithm 1: The Local Elasticity Based Clustering Algorithm" (a sketch in PyTorch follows the table) |
| Open Source Code | No | The paper does not provide concrete access to its own source code; it only references a third-party CNN architecture at https://github.com/pytorch/examples/blob/master/mnist/main.py |
| Open Datasets | Yes | "We evaluate the performance of Algorithm 1 on MNIST (LeCun, 1998) and CIFAR-10 (Krizhevsky, 2009). ResNet-152 is pre-trained on ImageNet (Deng et al., 2009)..." (see the feature-extraction sketch below) |
| Dataset Splits | No | The paper describes how primary and auxiliary datasets are constructed by random sampling (e.g., "randomly sampling a total of 1000 examples equally from the two classes in the pair"), but it does not specify explicit training, validation, and test splits with percentages or counts. |
| Hardware Specification | No | The paper does not provide hardware details such as GPU models, CPU specifications, or cloud computing resources used for its experiments. |
| Software Dependencies | No | The paper mentions using a CNN architecture from a PyTorch example, implying PyTorch, but it does not give version numbers for PyTorch or any other software dependency. |
| Experiment Setup | Yes | "We use 40960 neurons with the ReLU activation function for two-layer FNN and details of the other architectures can be found in Appendix A.4. We run Algorithm 1 in one epoch. For each pair (e.g., 5 and 8), we construct the primary dataset by randomly sampling a total of 1000 examples equally from the two classes in the pair. The auxiliary dataset consists of 1000 examples that are randomly drawn from one or two different classes (in the case of two classes, evenly distribute the 1000 examples across the two classes)." (see the data-construction sketch below) |
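Since the paper provides Algorithm 1 only as pseudocode and no source code, the following is a minimal PyTorch sketch of one plausible reading of it: the network is trained for one epoch on the labeled auxiliary data, the predictions on the unlabeled primary examples are recorded after every SGD step, and examples whose predictions move together are grouped by K-means. All identifiers (`le_cluster`, `aux_loader`, `primary_x`) and the exact bookkeeping are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

def le_cluster(model, aux_loader, primary_x, n_clusters=2, lr=0.01):
    """Cluster primary_x via its prediction-change trajectories during one
    epoch of SGD on the labeled auxiliary data (local-elasticity heuristic).
    Assumes model(x) returns logits of shape (N, C)."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    deltas = []  # one block of prediction changes per SGD step

    with torch.no_grad():
        prev = model(primary_x)

    for x_aux, y_aux in aux_loader:          # one pass = one epoch, as in the paper
        opt.zero_grad()
        loss_fn(model(x_aux), y_aux).backward()
        opt.step()                           # SGD update on an auxiliary batch
        with torch.no_grad():
            cur = model(primary_x)
            deltas.append((cur - prev).flatten(1))
            prev = cur

    # Local elasticity: an update moves predictions of similar inputs by
    # similar amounts, so trajectories of same-class examples align.
    feats = torch.cat(deltas, dim=1).numpy()
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
```

Spectral clustering on a similarity matrix built from these trajectories would be an equally faithful reading; K-means on the raw trajectories is simply the shorter variant to sketch.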
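The Experiment Setup row fixes the sampling scheme precisely enough to sketch. Assuming torchvision (the paper only implies PyTorch) and taking the pair (5, 8) from the paper's own example, the primary and auxiliary sets could be drawn as follows; the auxiliary classes (3, 6), the helper `sample_classes`, and the two-way output head are illustrative choices.

```python
import torch
from torchvision import datasets, transforms

mnist = datasets.MNIST("./data", train=True, download=True,
                       transform=transforms.ToTensor())
labels = mnist.targets

def sample_classes(classes, total=1000):
    """Randomly draw `total` examples split evenly across `classes`."""
    idx = []
    for c in classes:
        pool = torch.nonzero(labels == c).squeeze(1)
        idx.append(pool[torch.randperm(len(pool))[: total // len(classes)]])
    return torch.cat(idx)

primary_idx = sample_classes([5, 8])    # unlabeled pair to be clustered
auxiliary_idx = sample_classes([3, 6])  # labeled examples from other classes

# Two-layer FNN with 40960 hidden ReLU units, matching the quoted setup;
# the output width (one logit per auxiliary class) is an assumption.
model = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(28 * 28, 40960),
    torch.nn.ReLU(),
    torch.nn.Linear(40960, 2),
)
```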
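For CIFAR-10 the paper states that a ResNet-152 pre-trained on ImageNet is used. One common way to realize this, assumed here since the paper gives no code, is to treat the pretrained network as a frozen feature extractor via torchvision (the `weights` argument requires torchvision 0.13 or later):

```python
import torch
from torchvision import models

# ImageNet-pretrained ResNet-152 as a frozen feature extractor (assumed usage).
resnet = models.resnet152(weights=models.ResNet152_Weights.IMAGENET1K_V1)
resnet.fc = torch.nn.Identity()        # drop the 1000-way head; keep 2048-d features
resnet.eval()

images = torch.randn(8, 3, 224, 224)   # stand-in for a CIFAR-10 batch resized to 224x224
with torch.no_grad():
    features = resnet(images)          # shape (8, 2048)
```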