Enhancing Simple Models by Exploiting What They Already Know
Authors: Amit Dhurandhar, Karthikeyan Shanmugam, Ronny Luss
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The benefit of these contributions is witnessed in the experiments where on 6 UCI datasets and CIFAR-10 we outperform competitors in a majority (16 out of 27) of the cases and tie for best performance in the remaining cases. |
| Researcher Affiliation | Industry | IBM Research, Yorktown Heights, NY, USA. |
| Pseudocode | Yes | Algorithm 1 Our proposed method SRatio. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described. |
| Open Datasets | Yes | We experiment on 6 real datasets from UCI repository (Dheeru & Karra Taniskidou, 2017): Ionosphere, Ovarian Cancer (OC), Heart Disease (HD), Waveform, Human Activity Recognition (HAR), Musk as well as CIFAR-10 (Krizhevsky, 2009). |
| Dataset Splits | Yes | Datasets are randomly split into 70% train and 30% test. Results for all methods are averaged over 10 random splits and reported in Table 2 with 95% confidence intervals. Optimal values for γ and β are found using 10-fold cross-validation. ... 500 samples from the CIFAR-10 test set are used for validation and hyperparameter tuning (details in supplement). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | Yes | Tensorflow 1.5.0 was used for CIFAR-10 experiments. |
| Experiment Setup | Yes | Datasets are randomly split into 70% train and 30% test. Optimal values for γ and β are found using 10-fold cross-validation. The complex model is an 18 unit ResNet with 15 residual (Res) blocks/units. ... Distillation (Geoffrey Hinton, 2015) employs cross-entropy loss with soft targets to train the simple model. The soft targets are the softmax outputs of the complex model's last layer rescaled by temperature t = 0.5 which was selected based on cross-validation. |
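
The Distillation baseline in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the usual reading of "rescaled by temperature": logits are divided by t before the softmax. The logits, the function names, and the choice to apply the same temperature to the simple model's outputs are all assumptions made for illustration; only t = 0.5 and the use of cross-entropy with soft targets come from the quoted text. This is not the authors' code, which the table notes is not released.

```python
import math


def softmax_with_temperature(logits, t):
    """Return softmax(logits / t) as a list of probabilities."""
    if t <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / t for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cross_entropy(soft_targets, student_logits, t):
    """Cross-entropy between soft targets and the student's tempered softmax.

    Assumption: the student's logits are rescaled by the same temperature t.
    """
    student_probs = softmax_with_temperature(student_logits, t)
    eps = 1e-12  # guards against log(0)
    return -sum(p * math.log(q + eps)
                for p, q in zip(soft_targets, student_probs))


# Hypothetical last-layer outputs for one example with three classes.
teacher_logits = [2.0, 1.0, 0.1]
student_logits = [1.5, 0.8, 0.2]

T = 0.5  # temperature reported in the Experiment Setup row
soft_targets = softmax_with_temperature(teacher_logits, T)
plain_targets = softmax_with_temperature(teacher_logits, 1.0)
loss = soft_cross_entropy(soft_targets, student_logits, T)
```

With t below 1 the soft targets are sharper than the plain softmax, so the simple model is pushed more strongly toward the complex model's top class; the loss is smallest when the student's tempered outputs match the soft targets.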