Enhancing Simple Models by Exploiting What They Already Know

Authors: Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The benefit of these contributions is witnessed in the experiments where on 6 UCI datasets and CIFAR-10 we outperform competitors in a majority (16 out of 27) of the cases and tie for best performance in the remaining cases.
Researcher Affiliation | Industry | IBM Research, Yorktown Heights, NY, USA.
Pseudocode | Yes | Algorithm 1 Our proposed method SRatio.
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | Yes | We experiment on 6 real datasets from UCI repository (Dheeru & Karra Taniskidou, 2017): Ionosphere, Ovarian Cancer (OC), Heart Disease (HD), Waveform, Human Activity Recognition (HAR), Musk as well as CIFAR-10 (Krizhevsky, 2009).
Dataset Splits | Yes | Datasets are randomly split into 70% train and 30% test. Results for all methods are averaged over 10 random splits and reported in Table 2 with 95% confidence intervals. Optimal values for γ and β are found using 10-fold cross-validation. ... 500 samples from the CIFAR-10 test set are used for validation and hyperparameter tuning (details in supplement). (See the split/tuning sketch after the table.)
Hardware Specification | No | The paper does not specify the hardware (exact GPU/CPU models, processor speeds, or memory amounts) used to run its experiments.
Software Dependencies | Yes | TensorFlow 1.5.0 was used for CIFAR-10 experiments.
Experiment Setup | Yes | Datasets are randomly split into 70% train and 30% test. Optimal values for γ and β are found using 10-fold cross-validation. The complex model is an 18 unit ResNet with 15 residual (Res) blocks/units. ... Distillation (Geoffrey Hinton, 2015) employs cross-entropy loss with soft targets to train the simple model. The soft targets are the softmax outputs of the complex model's last layer rescaled by temperature t = 0.5, which was selected based on cross-validation. (See the distillation sketch after the table.)
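The Dataset Splits row describes a repeat-and-average protocol: for each of 10 random 70/30 splits, hyperparameters are tuned by 10-fold cross-validation on the training portion, the tuned model is scored on the held-out 30%, and the mean test accuracy is reported with a 95% confidence interval. Below is a minimal sketch of that protocol, assuming a generic scikit-learn estimator; LogisticRegression and its C grid are illustrative stand-ins for the paper's simple models and their γ/β search, not the authors' code.

import numpy as np
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import LogisticRegression  # stand-in simple model

def evaluate(X, y, n_repeats=10, seed=0):
    accs = []
    for i in range(n_repeats):
        # 70% train / 30% test, re-drawn for each repeat
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.3, random_state=seed + i)
        # 10-fold cross-validation over a hyperparameter grid
        # (standing in for the paper's gamma/beta search)
        search = GridSearchCV(LogisticRegression(max_iter=1000),
                              param_grid={"C": [0.01, 0.1, 1.0, 10.0]},
                              cv=10)
        search.fit(X_tr, y_tr)
        accs.append(search.score(X_te, y_te))
    accs = np.array(accs)
    mean = accs.mean()
    # 95% confidence interval half-width under a normal approximation
    half_width = 1.96 * accs.std(ddof=1) / np.sqrt(n_repeats)
    return mean, half_width

Calling evaluate(X, y) on any feature matrix and label vector returns the averaged test accuracy and the confidence-interval half-width in the same form as the paper's reported numbers.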
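The Experiment Setup row's distillation baseline trains the simple model with a cross-entropy loss against soft targets, i.e. the complex model's softmax rescaled by temperature t = 0.5. Below is a minimal NumPy sketch of that soft-target loss, assuming logits are available from both models; the function and argument names are illustrative and are not taken from the paper's TensorFlow 1.5.0 implementation.

import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=0.5):
    # Soft targets: the complex model's last-layer softmax rescaled by t
    soft_targets = softmax(teacher_logits, temperature)
    # Cross-entropy of the simple model's temperature-scaled predictions
    # against those soft targets, averaged over the batch
    student_probs = softmax(student_logits, temperature)
    return -np.mean(np.sum(soft_targets * np.log(student_probs + 1e-12), axis=1))

Only the soft-target term quoted above is shown; common distillation variants additionally mix in a hard-label cross-entropy term, which the quoted setup does not mention.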