Anytime Inference with Distilled Hierarchical Neural Ensembles

Authors: Adria Ruiz, Jakob Verbeek

AAAI 2021, pp. 9463-9471

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments show that, compared to previous anytime inference models, HNE provides state-of-the-art accuracy-computation trade-offs on the CIFAR-10/100 and ImageNet datasets. (A generic sketch of anytime ensemble inference follows this table.)
Researcher Affiliation | Collaboration | Adria Ruiz (1), Jakob Verbeek (2); (1) Institut de Robòtica i Informàtica Industrial, CSIC-UPC, aruiz@iri.upc.edu; (2) Facebook AI Research, jjverbeek@fb.com
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | We have released a PyTorch implementation of HNE: https://gitlab.com/adriaruizo/dhne-aaai21
Open Datasets | Yes | We experiment with the CIFAR-10/100 (Krizhevsky 2009) and ImageNet (Russakovsky et al. 2015) datasets.
Dataset Splits | Yes | CIFAR-10/100 contain 50k train and 10k test images from 10 and 100 classes, respectively. Following standard protocols (He et al. 2016), we pre-process the images by normalizing the mean and standard deviation of each color channel. Additionally, during training we use data augmentation: we extract random 32×32 crops after applying 4-pixel zero padding to the original image or its horizontal flip. ImageNet comprises 1.2M training and 50k validation high-resolution images, labelled across 1,000 categories. (A torchvision sketch of this pre-processing follows this table.)
Hardware Specification | No | The paper mentions general use of GPUs but does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments.
Software Dependencies | No | The paper mentions a 'PyTorch implementation' but does not specify its version number or any other software dependencies with their versions.
Experiment Setup | No | The paper states, 'In the supplementary material we present a detailed description of our HNE implementation using ResNet and MobileNetV2 and provide all the training hyperparameters,' indicating that specific experimental setup details are not in the main text.
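The "Research Type" row quotes the paper's claim about accuracy-computation trade-offs under anytime inference. As a rough illustration of that general idea only (not the authors' HNE architecture, which shares parameters hierarchically and uses distillation), the sketch below evaluates generic ensemble members sequentially and keeps a running average of their logits, so a usable prediction is available whenever the compute budget runs out. The `make_member` factory is hypothetical and not from the paper.

```python
# Generic anytime inference with an ensemble: a concept sketch only,
# NOT the paper's HNE method. Members are evaluated one by one and the
# running average of their logits can be returned at any point.
import torch
import torch.nn as nn

def anytime_predict(members: nn.ModuleList, x: torch.Tensor, budget: int) -> torch.Tensor:
    """Average the logits of the first `budget` ensemble members."""
    avg_logits = None
    for i, member in enumerate(members):
        if i >= budget:  # compute budget exhausted: stop early
            break
        logits = member(x)
        # Incremental mean: avg_{i+1} = avg_i + (logits - avg_i) / (i + 1)
        avg_logits = logits if avg_logits is None else avg_logits + (logits - avg_logits) / (i + 1)
    return avg_logits

# Hypothetical usage: four small members, evaluate only the first two.
make_member = lambda: nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))
members = nn.ModuleList([make_member() for _ in range(4)])
out = anytime_predict(members, torch.randn(8, 3, 32, 32), budget=2)
```

Larger budgets trade extra computation for the accuracy gain of averaging more members, which is the trade-off the quoted claim refers to.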
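The CIFAR pre-processing quoted in the "Dataset Splits" row maps directly onto standard torchvision transforms. A minimal sketch follows; the per-channel mean/std values are the commonly used CIFAR-10 statistics, assumed here because the paper does not list them.

```python
# Minimal sketch of the quoted CIFAR training pipeline using torchvision.
# The normalization statistics are assumed (standard CIFAR-10 values),
# not taken from the paper.
import torchvision.transforms as T
from torchvision.datasets import CIFAR10

CIFAR10_MEAN = (0.4914, 0.4822, 0.4465)  # assumed per-channel means
CIFAR10_STD = (0.2470, 0.2435, 0.2616)   # assumed per-channel stds

train_transform = T.Compose([
    T.RandomCrop(32, padding=4),   # 4-pixel zero padding, then random 32x32 crop
    T.RandomHorizontalFlip(),      # horizontal-flip augmentation
    T.ToTensor(),
    T.Normalize(CIFAR10_MEAN, CIFAR10_STD),  # per-channel normalization
])

test_transform = T.Compose([
    T.ToTensor(),
    T.Normalize(CIFAR10_MEAN, CIFAR10_STD),
])

train_set = CIFAR10(root="./data", train=True, download=True, transform=train_transform)
test_set = CIFAR10(root="./data", train=False, download=True, transform=test_transform)
```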