Cooperative Knowledge Distillation: A Learner Agnostic Approach
Authors: Michael Livanos, Ian Davidson, Stephen Wong
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that our approach not only outperforms baselines such as transfer learning, self-supervised learning, and multiple knowledge distillation algorithms on several datasets, but it can also be used in settings where the aforementioned techniques cannot. |
| Researcher Affiliation | Academia | Michael Livanos, Ian Davidson, Stephen Wong; University of California, Davis; mjlivanos@ucdavis.edu, davidson@cs.ucdavis.edu, stswong@ucdavis.edu |
| Pseudocode | Yes | Algorithm 1: Cooperative knowledge distillation. |
| Open Source Code | Yes | To aid in reproducibility, code is provided on GitHub: https://github.com/MLivanos/Cooperative-Knowledge-Distillation |
| Open Datasets | Yes (a hedged data-splitting sketch follows this table) | Datasets 1 and 2 come from Craigslist (Reese 2021) and Auction Export (Alsenani 2020) respectively, and are used for training and validating models. A test set from Car Guru (Mital 2020) simulates a future distribution all models will have to predict. ... Four datasets are used, each predicting the presence or absence of heart disease from hospitals at different locations: Long Beach (Model 1), Switzerland (Model 2), Hungary (Model 3), and Cleveland (Janosi 1988), all sourced from (Dua and Graff 2017). ... We create ten datasets from the grayscale image dataset Fashion MNIST (Xiao, Rasul, and Vollgraf 2017). ... Three random and non-overlapping subsets are extracted from the Statlog German Credit dataset. |
| Dataset Splits | No | The paper mentions 'used for training and validating models' and 'hyperparameter selection maximized validation set accuracy', but does not provide specific percentages, sample counts, or a detailed methodology for the dataset split. |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory specifications) used for running the experiments were found in the paper. |
| Software Dependencies | No | The paper states that 'code is provided on GitHub' but does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | No | The paper states that 'hyperparameter selection maximized validation set accuracy' but does not provide specific values for hyperparameters (e.g., learning rate, batch size, number of epochs) or detailed training configurations. |
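The "Open Datasets" row notes that the paper builds ten datasets from Fashion MNIST and three non-overlapping subsets from Statlog German Credit, while the exact procedure is only documented in the linked repository. Below is a minimal sketch, not taken from the paper or its code, of how such disjoint subsets could be produced; the helper `non_overlapping_subsets`, the subset count, and the random seed are assumptions for illustration.

```python
# Hedged sketch (assumptions, not the authors' code): partition a public
# dataset into disjoint subsets, mirroring the paper's description of ten
# Fashion MNIST datasets and three Statlog German Credit subsets.
import numpy as np
from torchvision import datasets  # Fashion MNIST is publicly available via torchvision


def non_overlapping_subsets(n_items: int, n_subsets: int, seed: int = 0):
    """Partition range(n_items) into n_subsets disjoint, roughly equal index groups."""
    rng = np.random.default_rng(seed)           # seed is an assumption; the paper reports none
    shuffled = rng.permutation(n_items)         # shuffle all indices once
    return np.array_split(shuffled, n_subsets)  # split the shuffled indices without overlap


if __name__ == "__main__":
    train = datasets.FashionMNIST(root="data", train=True, download=True)
    subsets = non_overlapping_subsets(len(train), n_subsets=10)
    print([len(s) for s in subsets])  # ten disjoint index sets covering the 60k training images
```

Each index set would correspond to one model's training data in the multi-model setup the excerpts describe; the actual subset sizes, preprocessing, and seeds would have to be recovered from the linked GitHub repository.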