Cooperative Knowledge Distillation: A Learner Agnostic Approach

Authors: Michael Livanos, Ian Davidson, Stephen Wong

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that our approach not only outperforms baselines such as transfer learning, self-supervised learning, and multiple knowledge distillation algorithms on several datasets, but it can also be used in settings where the aforementioned techniques cannot.
Researcher Affiliation | Academia | Michael Livanos, Ian Davidson, Stephen Wong; University of California, Davis; mjlivanos@ucdavis.edu, davidson@cs.ucdavis.edu, stswong@ucdavis.edu
Pseudocode | Yes | Algorithm 1: Cooperative knowledge distillation. (An illustrative, non-authoritative sketch of such a loop appears after this table.)
Open Source Code | Yes | To aid in reproducibility, code is provided on GitHub: https://github.com/MLivanos/Cooperative-Knowledge-Distillation
Open Datasets | Yes | Datasets 1 and 2 come from Craigslist (Reese 2021) and Auction Export (Alsenani 2020) respectively, and are used for training and validating models. A test set from Car Guru (Mital 2020) simulates a future distribution all models will have to predict. ... Four datasets are used, each predicting the presence or absence of heart disease from hospitals at different locations: Long Beach (Model 1), Switzerland (Model 2), Hungary (Model 3), and Cleveland (Janosi 1988), all sourced from (Dua and Graff 2017). ... We create ten datasets from the grayscale image dataset Fashion MNIST (Xiao, Rasul, and Vollgraf 2017). ... Three random and non-overlapping subsets are extracted from the Statlog German Credit dataset. (One possible Fashion-MNIST partitioning is sketched after this table.)
Dataset Splits | No | The paper mentions 'used for training and validating models' and 'hyperparameter selection maximized validation set accuracy', but does not provide specific percentages, sample counts, or a detailed methodology for the dataset split.
Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments were found in the paper.
Software Dependencies | No | The paper states that 'code is provided on GitHub' but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | No | The paper states that 'hyperparameter selection maximized validation set accuracy' but does not provide specific values for hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations.
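The Pseudocode row notes that the paper provides Algorithm 1 (Cooperative knowledge distillation), but this page does not reproduce it. For orientation only, the sketch below shows a generic, learner-agnostic mutual-distillation round in which each model pseudo-labels a shared unlabeled pool for every other model. The model choices, confidence threshold, and update rule are assumptions made for this example; they are not taken from the paper's Algorithm 1 or its GitHub code.

```python
# Illustrative sketch only: one generic, learner-agnostic round of mutual
# distillation over a shared unlabeled pool. NOT a transcription of the
# paper's Algorithm 1; the 0.9 confidence threshold and refit-from-scratch
# update are assumptions for the example.
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression


def cooperative_round(models, train_sets, unlabeled_pool):
    """Every model acts as teacher for every other model via pseudo-labels."""
    updated = []
    for i, student in enumerate(models):
        X_i, y_i = train_sets[i]
        teacher_X, teacher_y = [], []
        for j, teacher in enumerate(models):
            if j == i:
                continue
            proba = teacher.predict_proba(unlabeled_pool)
            confident = proba.max(axis=1) >= 0.9        # assumed threshold
            teacher_X.append(unlabeled_pool[confident])
            teacher_y.append(teacher.classes_[proba[confident].argmax(axis=1)])
        X_aug = np.vstack([X_i] + teacher_X)
        y_aug = np.concatenate([y_i] + teacher_y)
        # Refit a fresh copy of the student on its own data plus distilled points.
        updated.append(clone(student).fit(X_aug, y_aug))
    return updated


# Toy usage with two heterogeneous learners (learner-agnostic by construction).
rng = np.random.default_rng(0)
X1, y1 = rng.normal(size=(200, 8)), rng.integers(0, 2, 200)
X2, y2 = rng.normal(size=(200, 8)), rng.integers(0, 2, 200)
pool = rng.normal(size=(500, 8))

m1 = RandomForestClassifier(random_state=0).fit(X1, y1)
m2 = LogisticRegression(max_iter=1000).fit(X2, y2)
m1, m2 = cooperative_round([m1, m2], [(X1, y1), (X2, y2)], pool)
```

Because every model is refit on its own data plus points distilled from the others, the procedure places no constraint on what kind of learner each party uses, which is the sense in which such a loop is learner agnostic.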
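Similarly, the Open Datasets row states that ten datasets are created from Fashion MNIST, but the excerpt does not describe how. The snippet below is one plausible reading under an explicit assumption: it simply partitions the 60,000 training images into ten disjoint random shards with torchvision. The paper's actual subsets may be constructed differently (for example, per-class or with a deliberate distribution shift).

```python
# Illustrative sketch only: partition Fashion-MNIST into ten disjoint random
# shards of 6,000 images each. The equal-size random split is an assumption,
# not the paper's recipe.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

train = datasets.FashionMNIST(
    root="data", train=True, download=True, transform=transforms.ToTensor()
)
generator = torch.Generator().manual_seed(0)            # reproducible shards
shards = random_split(train, [len(train) // 10] * 10, generator=generator)

for i, shard in enumerate(shards):
    print(f"shard {i}: {len(shard)} images")             # 6000 each
```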