Network Approximation using Tensor Sketching

Authors: Shiva Prasad Kasiviswanathan, Nina Narodytska, Hongxia Jin

IJCAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we experimentally demonstrate the effectiveness of our proposed network approximation approach. Metrics. We define compression rate as the ratio between the number of parameters in the reduced (compressed) network architecture and the number of parameters in the original (uncompressed) network architecture. The top-1 error of a trained model is denoted by ERRtop-1. Datasets. We use 5 popular image datasets: CIFAR10, SVHN, STL10, ImageNet10 (a subset of the ImageNet1000 dataset), and Places2. Network Architectures. We present our experiments on two different network architectures: Network-in-Network [Lin et al., 2014] (NiN) and GoogLeNet [Szegedy et al., 2015] (which we use for the Places2 dataset).
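As a quick illustration of the compression-rate metric quoted above, the sketch below computes reduced-over-original parameter counts; the function name and the parameter counts are illustrative, not taken from the paper.

```python
def compression_rate(reduced_params: int, original_params: int) -> float:
    """Compression rate as defined above: parameter count of the compressed
    network divided by that of the original network (lower means a smaller model)."""
    return reduced_params / original_params

# Hypothetical example: a network shrunk from 1,000,000 to 250,000 parameters
print(compression_rate(250_000, 1_000_000))  # → 0.25
```

Under this definition a compression rate of 0.25 means the reduced network keeps a quarter of the original parameters.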
Researcher Affiliation | Industry | Shiva Prasad Kasiviswanathan (Amazon AWS AI, USA), Nina Narodytska (VMware Research, USA), Hongxia Jin (Samsung Research America, USA); kasivisw@gmail.com, nnarodytska@vmware.com, hongxia.jin@samsung.com
Pseudocode | No | The paper does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any links to source code or explicitly state that source code for the described methodology is publicly available.
Open Datasets | Yes | Datasets. We use 5 popular image datasets: CIFAR10, SVHN, STL10, ImageNet10 (a subset of the ImageNet1000 dataset), and Places2.
Dataset Splits | No | The paper mentions training on datasets and reports "top-1 error," implying a test set, but it does not explicitly specify training, validation, and test splits (e.g., an 80/10/10 split or per-split sample counts) for its experiments. It mentions "validation" only in the context of batch normalization, not as a dataset split.
Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., CPU or GPU models, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper mentions general deep learning frameworks (e.g., "deep learning frameworks [Chetlur et al., 2014]") but does not provide specific software dependencies with version numbers for reproducibility.
Experiment Setup | No | The paper describes aspects of the model architecture (e.g., a "stride of 1 and zero-padding of 0" for convolution) but does not provide specific experimental setup details such as hyperparameter values (learning rate, batch size, number of epochs) or optimizer settings.