Efficient On-Device Models using Neural Projections
Authors: Sujith Ravi
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate the effectiveness of the proposed approach with several experiments on different benchmark datasets and classification tasks involving visual recognition and language understanding. The experiments are run in TensorFlow (Abadi et al., 2015). Evaluation: For each task, we compute the performance of each model in terms of precision@1, i.e., accuracy % of the top predicted output class. Models were trained using multiple runs; each experiment was run for a fixed number of (400k) time steps with a batch size of 200 for the visual tasks and 100 for the text classification task. |
| Researcher Affiliation | Industry | Sujith Ravi, Google Research, Mountain View, California, USA. Correspondence to: Sujith Ravi <sravi@google.com>. |
| Pseudocode | No | The paper describes the proposed methods and operations using natural language and mathematical equations, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'TensorFlow (Abadi et al., 2015)' and the 'TensorFlow Lite (TFLite) open-source library' but does not provide a link or explicit statement about releasing the source code for its own described methodology. |
| Open Datasets | Yes | Tasks: We apply our neural projection approach and compare against baseline systems on the benchmark image classification datasets MNIST, Fashion-MNIST (Xiao et al., 2017) and CIFAR-10. The MNIST dataset contains 60k instances for training and 10k instances for testing. Fashion-MNIST is a recent real-world dataset with similar grayscale-style images and splits as MNIST, but it is a much harder task. The CIFAR-10 dataset contains colour images, with 50k for training and 10k for testing. Smart Reply Intent is a real-world semantic intent classification task for automatically generating short email responses (Kannan et al., 2016). ATIS is a benchmark corpus used in the speech and dialog community (Tur et al., 2010). |
| Dataset Splits | Yes | The MNIST dataset contains 60k instances for training and 10k instances for testing; we hold out 5k instances from the training split as a dev set for tuning system parameters. Smart Reply Intent ... with 20 intent classes, 5483 samples (3832 for training, 560 for validation and 1091 for testing). |
| Hardware Specification | No | The paper mentions general computing resources like 'high-performance distributed computing involving several CPU cores or graphics processing units (GPUs)' for training. It also mentions 'a Pixel phone' for measuring inference latency, but this is not the hardware used for the experiments themselves. No specific hardware models or detailed specifications for the experimental setup are provided. |
| Software Dependencies | No | The paper states: 'The experiments are run in TensorFlow (Abadi et al., 2015).' and mentions the 'TensorFlow Lite (TFLite) open-source library.' However, it does not provide specific version numbers for TensorFlow or any other libraries used. |
| Experiment Setup | Yes | Models were trained using multiple runs; each experiment was run for a fixed number of (400k) time steps with a batch size of 200 for the visual tasks and 100 for the text classification task. In our experiments, we set them to λ1 = 1.0, λ2 = 0.1, λ3 = 1.0 (hyperparameters for the loss function). For MNIST, we use a feed-forward NN architecture (3 layers, 1000 hidden units per layer) with L2-regularization as one of the baselines and as the trainer network for ProjectionNet (architectural details). We use an RNN sequence model with a multilayer LSTM architecture (2 layers, 100 dimensions) as the baseline for the Smart Reply Intent task. |
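The quoted setup can be illustrated with a minimal NumPy sketch of the two concrete pieces the table extracts: the precision@1 metric and the three-term weighted loss with λ1 = 1.0, λ2 = 0.1, λ3 = 1.0. This is an illustrative reconstruction, not code from the paper: the function names (`precision_at_1`, `joint_loss`) and the mapping of the λ weights to a trainer cross-entropy term, a distillation-style term, and a projection cross-entropy term are assumptions made here for clarity.

```python
import numpy as np

def precision_at_1(logits, labels):
    """Accuracy % metric from the paper: fraction of examples whose
    top predicted class matches the true label."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))

def cross_entropy(probs, labels, eps=1e-9):
    """Mean negative log-likelihood of the true class."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

def joint_loss(trainer_probs, projection_probs, labels,
               lam1=1.0, lam2=0.1, lam3=1.0):
    """Weighted joint objective, hypothetically decomposed as:
    lam1 * trainer loss on true labels
    + lam2 * distillation-style term (projection matches trainer's soft output)
    + lam3 * projection loss on true labels."""
    l_trainer = cross_entropy(trainer_probs, labels)
    # Distillation term: cross-entropy of projection output against the
    # trainer network's soft predictions.
    l_distill = float(-np.mean(
        np.sum(trainer_probs * np.log(projection_probs + 1e-9), axis=1)))
    l_proj = cross_entropy(projection_probs, labels)
    return lam1 * l_trainer + lam2 * l_distill + lam3 * l_proj
```

The λ2 = 0.1 weight down-weights the distillation term relative to the two supervised cross-entropy terms, consistent with the values reported in the table above.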