Zero Time Waste: Recycling Predictions in Early Exit Neural Networks
Authors: Maciej Wołczyk, Bartosz Wójcik, Klaudia Bałazy, Igor T. Podolak, Jacek Tabor, Marek Śmieja, Tomasz Trzciński
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments across various datasets and architectures to demonstrate that ZTW achieves a significantly better accuracy vs. inference time trade-off than other recently proposed early exit methods. |
| Researcher Affiliation | Collaboration | Maciej Wołczyk (Jagiellonian University), Bartosz Wójcik (Jagiellonian University), Klaudia Bałazy (Jagiellonian University), Igor Podolak (Jagiellonian University), Jacek Tabor (Jagiellonian University), Marek Śmieja (Jagiellonian University), Tomasz Trzciński (Jagiellonian University, Warsaw University of Technology, Tooploox) |
| Pseudocode | Yes | Algorithm 1 Zero Time Waste (an illustrative early-exit inference sketch follows the table) |
| Open Source Code | Yes | We provide the source code for our experiments at https://github.com/gmum/Zero-Time-Waste. |
| Open Datasets | Yes | For the evaluation in supervised learning, we use three datasets: CIFAR-10, CIFAR-100, and Tiny ImageNet, and four commonly used architectures: ResNet-56 [11], MobileNet [15], WideResNet [45], and VGG-16BN [37] as base networks. ... Additionally, we examine how Zero Time Waste performs at reducing waste in a reinforcement learning setting of Atari 2600 environments. |
| Dataset Splits | No | The paper mentions using a 'held-out set' for selecting τ, but does not specify the exact percentages, sample counts, or splitting methodology for the training, validation, and test sets; it appears to rely implicitly on the standard benchmark splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models used for running its experiments. It mentions 'average number of floating-point operations' as a hardware-agnostic measure. |
| Software Dependencies | No | The paper mentions the 'PPO algorithm', the 'torchvision package', and 'Stable Baselines3', but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | No | The paper states: 'Appendix A.1 describes the details about the network architecture, hyperparameters, and training process.', indicating that concrete hyperparameter values and other setup details are deferred to the appendix rather than given in the main text. |
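
The table notes that the paper provides pseudocode (Algorithm 1) and selects a confidence threshold τ on a held-out set. As background for readers unfamiliar with early-exit inference, the sketch below shows confidence-thresholded early exits with a simple reuse of earlier heads' probabilities. The toy backbone, head placement, threshold value, and probability-averaging step are illustrative assumptions only; they are not the authors' architecture or their cascade/ensembling scheme.

```python
# Minimal, hypothetical sketch of early-exit inference with prediction reuse.
# NOT the paper's implementation: backbone, heads, tau, and the averaging
# "recycling" step are assumptions made for illustration.
import torch
import torch.nn as nn
import torch.nn.functional as F


class EarlyExitNet(nn.Module):
    """A toy backbone with an internal classifier (head) after each block."""

    def __init__(self, in_dim=32, hidden=64, num_classes=10, num_blocks=3):
        super().__init__()
        dims = [in_dim] + [hidden] * num_blocks
        self.blocks = nn.ModuleList(
            nn.Sequential(nn.Linear(dims[i], dims[i + 1]), nn.ReLU())
            for i in range(num_blocks)
        )
        self.heads = nn.ModuleList(
            nn.Linear(hidden, num_classes) for _ in range(num_blocks)
        )


@torch.no_grad()
def early_exit_predict(model, x, tau=0.9):
    """Run blocks sequentially; stop at the first point where the combined
    confidence (max softmax probability) exceeds tau. Earlier heads'
    probabilities are reused via simple averaging -- a stand-in for the
    idea of not discarding previous internal classifiers' predictions."""
    probs_so_far = []
    for block, head in zip(model.blocks, model.heads):
        x = block(x)
        probs_so_far.append(F.softmax(head(x), dim=-1))
        combined = torch.stack(probs_so_far).mean(dim=0)  # reuse earlier exits
        confidence, prediction = combined.max(dim=-1)
        if confidence.item() >= tau:
            return prediction, len(probs_so_far)  # exited early
    return prediction, len(probs_so_far)  # fell through to the final head


if __name__ == "__main__":
    model = EarlyExitNet().eval()
    sample = torch.randn(1, 32)
    pred, heads_used = early_exit_predict(model, sample, tau=0.9)
    print(f"predicted class {pred.item()} after {heads_used} head(s)")
```

In this style of inference, τ controls the accuracy vs. inference-time trade-off the paper evaluates: a higher threshold pushes more samples through later, more expensive parts of the network, while a lower one exits earlier at some cost in accuracy.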