Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

SelfXit: An Unsupervised Early Exit Mechanism for Deep Neural Networks

Authors: Hossein KhademSohi, Mohammadamin Abedi, Yani Ioannou, Steve Drew, Pooyan Jamshidi, Hadi Hemmati

TMLR 2024

Reproducibility Variable Result LLM Response
Research Type Experimental The results of our experiments on two vision tasks (image classification and object detection) show that, on average, early exiting can reduce the computational complexity of these services up to 58% (in terms of FLOP count) and improve their inference latency up to 46% with a low to zero reduction in accuracy. SelfXit also outperforms existing methods, particularly on complex models and larger datasets. It achieves a notable reduction in latency of 51.6% and 30.4% on CIFAR100/ResNet50, with an accompanying increase in accuracy of 2.31% and 0.72%, on average, compared to GATI and BranchyNet.
Researcher Affiliation Academia Hossein KhademSohi EMAIL Mohammadamin Abedi EMAIL Yani Ioannou EMAIL Steve Drew EMAIL Department of Electrical and Software Engineering, University of Calgary, Calgary, AB, Canada; Pooyan Jamshidi EMAIL Department of Computer Science and Engineering, University of South Carolina, Columbia, SC, USA; Hadi Hemmati EMAIL Department of Electrical Engineering and Computer Science, York University, Toronto, ON, Canada
Pseudocode Yes Algorithm 1: Early Exit-enabled model inference. Require: Backbone (the original model); EarlyExitLayers (list of early exit layers); Layer (part of Backbone, including the associated early exit model and threshold). 1: procedure ForwardPass(X, callback)
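The quoted pseudocode only shows the procedure header, but its intent (run backbone blocks in order, consult each attached exit head, and return early once a head's confidence meets its threshold) can be sketched as below. All names (`blocks`, `exit_heads`, `thresholds`) and the toy demo are illustrative assumptions, not the authors' implementation:

```python
def forward_pass(x, blocks, exit_heads, thresholds):
    """Early-exit inference sketch: after each backbone block that has an
    exit head, compute the head's class scores; if the top score meets the
    block's confidence threshold, stop and return immediately.
    Assumes the last block always has an exit head (the final classifier)."""
    for i, block in enumerate(blocks):
        x = block(x)
        if i in exit_heads:
            probs = exit_heads[i](x)
            if max(probs) >= thresholds[i]:
                return probs, i  # early exit fired at block i
    return probs, i  # fell through to the final classifier


# Toy demo with list-valued "activations" and fixed head outputs:
blocks = [lambda v: [e + 1 for e in v], lambda v: [e * 2 for e in v]]
heads = {0: lambda v: [0.4, 0.6], 1: lambda v: [0.1, 0.9]}
probs, exit_idx = forward_pass([0.0, 0.0], blocks, heads, {0: 0.9, 1: 0.5})
# head 0's confidence (0.6) is below its 0.9 threshold, so inference
# continues and exits at block 1
```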
Open Source Code Yes We released code and replication package at https://github.com/hoseinkhs/AutoCacheLayer/.
Open Datasets Yes We used the Labeled Faces in the Wild (LFW) dataset for face recognition, which contains 13,233 images of 5,749 individuals. ... We also used CIFAR10, CIFAR100, and ImageNet test sets (Krizhevsky & Hinton, 2009; Russakovsky et al., 2015) for object classification... We used the Cityscapes dataset to assess the presence of pedestrians (Cordts et al., 2016). ... The dataset we use is from the Criteo dataset on Kaggle (Jean-Baptiste Tien, 2014)...
Dataset Splits Yes Each dataset mentioned above represents an inference workload for the models. Thus, we split each one into training, validation, and test partitions with 50%, 20%, and 30% proportions, respectively.
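The 50/20/30 split described in that row can be reproduced with a few lines of standard-library Python. The shuffling, seed, and function name are illustrative assumptions, not taken from the paper's code:

```python
import random


def split_dataset(items, seed=0):
    """Shuffle and partition items into 50% train, 20% validation,
    30% test, matching the proportions stated in the report.
    Seeded shuffling is an assumption made for reproducibility."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train, n_val = int(0.5 * n), int(0.2 * n)
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]
    return train, val, test


train, val, test = split_dataset(range(100))
# len(train), len(val), len(test) -> 50, 20, 30
```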
Hardware Specification Yes In our experiments, we have used an Intel(R) Core(TM) i7-10700K CPU @ 3.80GHz to measure on-CPU inference times and an NVIDIA GeForce RTX 3070 GPU to measure on-GPU inference time.
Software Dependencies Yes The software environment for our experiments included Ubuntu 20.04, Python 3.7, and PyTorch 1.1.
Experiment Setup Yes Assign confidence thresholds to built models to determine early exit hits ... Table 2 describes the collaborative performance of the early exit models within a confidence threshold of 0.9. ... Table 5: end-to-end evaluation of early exit-enabled models' improvement in average inference latency, batch size = 32
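The setup row ties a confidence threshold (0.9) to how often early exits "hit". A minimal sketch of that measurement, assuming per-sample exit-head confidences have already been collected (the helper name and data are hypothetical):

```python
def exit_hit_rate(confidences, threshold=0.9):
    """Fraction of samples whose exit-head confidence meets the threshold,
    i.e. how often an early exit would fire at that head.
    Illustrative helper; not the authors' evaluation code."""
    hits = sum(1 for c in confidences if c >= threshold)
    return hits / len(confidences)


# Example: 3 of 4 samples clear a 0.9 threshold
rate = exit_hit_rate([0.95, 0.80, 0.99, 0.91], threshold=0.9)
# rate -> 0.75
```

Sweeping `threshold` over a validation set is one simple way to trade accuracy against the latency/FLOP savings reported in Tables 2 and 5.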