Efficient On-Device Models using Neural Projections
Authors: Sujith Ravi
ICML 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section we demonstrate the effectiveness of the proposed approach with several experiments on different benchmark datasets and classification tasks involving visual recognition and language understanding. The experiments are run in TensorFlow (Abadi et al., 2015). Evaluation: For each task, we compute the performance of each model in terms of precision@1, i.e., accuracy % of the top predicted output class. Models were trained using multiple runs; each experiment was run for a fixed number of (400k) time steps with a batch size of 200 for the visual tasks and 100 for the text classification task. |
| Researcher Affiliation | Industry | Sujith Ravi, Google Research, Mountain View, California, USA. Correspondence to: Sujith Ravi <sravi@google.com>. |
| Pseudocode | No | The paper describes the proposed methods and operations using natural language and mathematical equations, but it does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using 'TensorFlow (Abadi et al., 2015)' and the 'TensorFlow Lite (TFLite) open-source library' but does not provide a link or explicit statement about releasing the source code for its own described methodology. |
| Open Datasets | Yes | Tasks: We apply our neural projection approach and compare against baseline systems on the benchmark image classification datasets MNIST, Fashion-MNIST (Xiao et al., 2017) and CIFAR-10. The MNIST dataset contains 60k instances for training and 10k instances for testing. Fashion-MNIST is a recent real-world dataset with similar grayscale-style images and splits as MNIST, but it is a much harder task. The CIFAR-10 dataset contains colour images, with 50k for training and 10k for testing. Smart Reply Intent is a real-world semantic intent classification task for automatically generating short email responses (Kannan et al., 2016). ATIS is a benchmark corpus used in the speech and dialog community (Tur et al., 2010). |
| Dataset Splits | Yes | The MNIST dataset contains 60k instances for training and 10k instances for testing; we hold out 5k instances from the training split as a dev set for tuning system parameters. Smart Reply Intent ... with 20 intent classes, 5483 samples (3832 for training, 560 for validation and 1091 for testing). |
| Hardware Specification | No | The paper mentions general computing resources like 'high-performance distributed computing involving several CPU cores or graphics processing units (GPUs)' for training. It also mentions 'a Pixel phone' for measuring inference latency, but this is not the hardware used for the experiments themselves. No specific hardware models or detailed specifications for the experimental setup are provided. |
| Software Dependencies | No | The paper states: 'The experiments are run in TensorFlow (Abadi et al., 2015).' and mentions the 'TensorFlow Lite (TFLite) open-source library.' However, it does not provide specific version numbers for TensorFlow or any other libraries used. |
| Experiment Setup | Yes | Models were trained using multiple runs; each experiment was run for a fixed number of (400k) time steps with a batch size of 200 for the visual tasks and 100 for the text classification task. In our experiments, we set them to λ1 = 1.0, λ2 = 0.1, λ3 = 1.0 (hyperparameters for the loss function). For MNIST, we use a feed-forward NN architecture (3 layers, 1000 hidden units per layer) with L2-regularization as one of the baselines and as the trainer network for ProjectionNet (architectural details). We use an RNN sequence model with a multilayer LSTM architecture (2 layers, 100 dimensions) as the baseline for the Smart Reply Intent task. |
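The quoted setup can be illustrated with a minimal NumPy sketch of the two concrete pieces the table extracts: the precision@1 metric and the three-term weighted loss with λ1 = 1.0, λ2 = 0.1, λ3 = 1.0. This is an illustrative reconstruction, not code from the paper: the function names (`precision_at_1`, `joint_loss`) and the mapping of the λ weights to a trainer cross-entropy term, a distillation-style term, and a projection cross-entropy term are assumptions made here for clarity.

```python
import numpy as np

def precision_at_1(logits, labels):
    """Accuracy % metric from the paper: fraction of examples whose
    top predicted class matches the true label."""
    return float(np.mean(np.argmax(logits, axis=1) == labels))

def cross_entropy(probs, labels, eps=1e-9):
    """Mean negative log-likelihood of the true class."""
    return float(-np.mean(np.log(probs[np.arange(len(labels)), labels] + eps)))

def joint_loss(trainer_probs, projection_probs, labels,
               lam1=1.0, lam2=0.1, lam3=1.0):
    """Weighted joint objective, hypothetically decomposed as:
    lam1 * trainer loss on true labels
    + lam2 * distillation-style term (projection matches trainer's soft output)
    + lam3 * projection loss on true labels."""
    l_trainer = cross_entropy(trainer_probs, labels)
    # Distillation term: cross-entropy of projection output against the
    # trainer network's soft predictions.
    l_distill = float(-np.mean(
        np.sum(trainer_probs * np.log(projection_probs + 1e-9), axis=1)))
    l_proj = cross_entropy(projection_probs, labels)
    return lam1 * l_trainer + lam2 * l_distill + lam3 * l_proj
```

The λ2 = 0.1 weight down-weights the distillation term relative to the two supervised cross-entropy terms, consistent with the values reported in the table above.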