A Kernel Perspective on Distillation-based Collaborative Learning
Authors: Sejun Park, Kihun Hong, Ganguk Hwang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we conduct experiments on DCL-KR and DCL-NN. To illustrate the superiority of our algorithms, we compare them with several baselines on various regression tasks. Experimental results show that DCL-KR achieves the same performance as the centralized model, even beyond the theoretical results. We also observe that DCL-NN significantly outperforms previous DCL frameworks in most settings. |
| Researcher Affiliation | Academia | Sejun Park, Kihun Hong, Ganguk Hwang; Department of Mathematical Sciences, Korea Advanced Institute of Science and Technology; {sejunpark, nuri9911, guhwang}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 DCL-KR Algorithm; Algorithm 2 DCL-NN Algorithm (an illustrative, non-authoritative sketch of this style of distillation loop appears after the table) |
| Open Source Code | Yes | The code is provided via the supplementary material. |
| Open Datasets | Yes | Datasets: We use the following six regression datasets to evaluate the performance. ... (1) Toy-1D [33] and (2) Toy-3D [6] are synthetic datasets... (3) Energy is a tabular dataset from the UCI database [12]... (4) Rotated MNIST is an image dataset whose task is to predict the rotation angle of rotated MNIST [11] images. (5) UTKFace [71] and (6) IMDB-WIKI [42, 52] are image datasets for age estimation. |
| Dataset Splits | Yes | We use 12,000 training data points distributed across the parties. We use 6,000 samples as public inputs and 1,000 samples for testing. (Energy dataset) ... We use 200,000 images as the entire training data, 50,000 images as public inputs, and 50,000 images as test data. (Rotated MNIST) ... We use 12,544 samples for training and 1,039 samples for testing. We have 6,234 public inputs. (UTKFace) ... We use 147,107 images as the entire training data, 36,780 images as public inputs, and 56,087 images as test data. (IMDB-WIKI) (A hypothetical partitioning sketch appears after the table.) |
| Hardware Specification | Yes | We simulate a decentralized setting on a single deep learning workstation (Intel(R) Xeon(R) Gold 6430 with one NVIDIA GeForce RTX 4090 GPU and 189GB RAM). |
| Software Dependencies | No | The experiments are implemented in PyTorch. ... All optimizers used are Adam [26]. While PyTorch and Adam are mentioned, no specific version numbers are provided for these software components. |
| Experiment Setup | Yes | Hyperparameters: T: total number of communication rounds, E: number of local iterations per communication round, η: learning rate (Algorithm 1) ... We set the learning rate η = 0.5... The number of local iterations E for DCL-KR is set to 5. ... Tables 5, 6, 7, 8, and 9 list various hyperparameters such as batch size, learning rate, and communication rounds for different algorithms and datasets. (A hedged configuration sketch appears after the table.) |
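The paper's Algorithm 1 (DCL-KR) and Algorithm 2 (DCL-NN) are not reproduced in this summary. As a rough, non-authoritative orientation, the sketch below shows a generic distillation-based collaborative learning round with kernel ridge regression parties: each party fits locally, a server averages the parties' predictions on shared public inputs, and each party distills that averaged prediction back into its local model. The class and function names, the RBF kernel, and the average-then-distill update are assumptions for illustration, not the paper's exact updates.

```python
# Minimal sketch of a distillation-based collaborative learning loop with
# kernel ridge regression parties. Illustrative only; not the paper's DCL-KR.
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    """RBF (Gaussian) kernel matrix between rows of A and rows of B."""
    sq_dists = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-gamma * sq_dists)

class KRRParty:
    """One party holding private data and a local kernel ridge regressor."""
    def __init__(self, X, y, reg=1e-2):
        self.X, self.y, self.reg = X, y, reg

    def fit(self, X_pub=None, t_pub=None):
        # Fit on private data, optionally augmented with distillation
        # targets t_pub evaluated on the shared public inputs X_pub.
        if X_pub is None:
            Xtr, ytr = self.X, self.y
        else:
            Xtr = np.vstack([self.X, X_pub])
            ytr = np.concatenate([self.y, t_pub])
        K = rbf_kernel(Xtr, Xtr)
        self.alpha = np.linalg.solve(K + self.reg * np.eye(len(Xtr)), ytr)
        self.Xtr = Xtr

    def predict(self, X):
        return rbf_kernel(X, self.Xtr) @ self.alpha

# Toy run: repeat predict -> average on public inputs -> distill.
rng = np.random.default_rng(0)
X_pub = rng.uniform(-1, 1, size=(50, 1))          # shared public inputs
parties = []
for _ in range(4):                                # four parties with private shards
    X = rng.uniform(-1, 1, size=(30, 1))
    y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(30)
    parties.append(KRRParty(X, y))

for p in parties:
    p.fit()                                       # initial local fit
for t in range(10):                               # communication rounds (T = 10 here)
    consensus = np.mean([p.predict(X_pub) for p in parties], axis=0)
    for p in parties:
        p.fit(X_pub=X_pub, t_pub=consensus)       # distill the averaged prediction
```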
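For the dataset splits quoted in the table, the hypothetical helper below illustrates one way a sample pool could be divided into per-party training shards, shared public inputs, and a held-out test set, using the Energy dataset sizes (12,000 / 6,000 / 1,000). The uniform shuffle-and-slice scheme and the choice of 10 parties are assumptions, not the paper's partitioning procedure.

```python
# Hypothetical partitioning sketch; the paper's actual data distribution
# across parties may differ (e.g., non-uniform or heterogeneous splits).
import numpy as np

def split_pool(n_total, n_train, n_public, n_test, n_parties, seed=0):
    """Shuffle indices 0..n_total-1 and slice them into per-party training
    shards, shared public inputs, and a test set."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_total)
    train_idx = idx[:n_train]
    public_idx = idx[n_train:n_train + n_public]
    test_idx = idx[n_train + n_public:n_train + n_public + n_test]
    party_shards = np.array_split(train_idx, n_parties)  # one shard per party
    return party_shards, public_idx, test_idx

# Energy dataset sizes from the table: 12,000 train / 6,000 public / 1,000 test.
shards, public_idx, test_idx = split_pool(
    n_total=19_000, n_train=12_000, n_public=6_000, n_test=1_000, n_parties=10)
```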
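Finally, a minimal configuration sketch for the DCL-KR hyperparameters named in the Experiment Setup row: only η = 0.5 and E = 5 are stated in the excerpt, so the value of T below is a placeholder, not a reported setting.

```python
from dataclasses import dataclass

@dataclass
class DCLKRConfig:
    T: int        # total number of communication rounds
    E: int        # local iterations per communication round
    eta: float    # learning rate

# eta = 0.5 and E = 5 come from the excerpt above; T = 100 is a placeholder,
# not a value reported in the paper.
config = DCLKRConfig(T=100, E=5, eta=0.5)
```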