Structured Cooperative Learning with Graphical Model Priors
Authors: Shuangtong Li, Tianyi Zhou, Xinmei Tian, Dacheng Tao
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate SCooL and compare it with existing decentralized learning methods on an extensive set of benchmarks, on which SCooL always achieves the highest accuracy of personalized models and significantly outperforms other baselines on communication efficiency. |
| Researcher Affiliation | Academia | (1) University of Science and Technology of China; (2) University of Maryland, College Park; (3) Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; (4) The University of Sydney. |
| Pseudocode | Yes | Algorithm 1: Structured Cooperative Learning |
| Open Source Code | Yes | Our code is available at https://github.com/ShuangtongLi/SCooL. |
| Open Datasets | Yes | CIFAR-10 (Krizhevsky et al., 2009), CIFAR-100, and MiniImageNet (Ravi & Larochelle, 2017) |
| Dataset Splits | Yes | We follow the hyperparameter values proposed in the baselines' papers except the learning rate, which is a constant tuned/selected from [0.01, 0.05, 0.1] for the best validation accuracy. |
| Hardware Specification | No | The paper mentions model architectures (e.g., 'two-layer CNN', 'four-layer CNN') and modifications (e.g., 'replace all the BN layers... with group-norm layers'), but does not specify any hardware details like GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper mentions software components like 'Adam' and 'SGD' and specifies hyperparameters, but it does not provide specific version numbers for any software libraries or dependencies (e.g., 'We use Adam (Kingma & Ba, 2014)...', 'we use SGD with learning rate of 0.01'). |
| Experiment Setup | Yes | In all methods' local model training, we use SGD with a learning rate of 0.01, weight decay of 5e-4, and batch size of 10. We follow the hyperparameter values proposed in the baselines' papers except the learning rate, which is a constant tuned/selected from [0.01, 0.05, 0.1] for the best validation accuracy. |
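
The Experiment Setup row pins down the local-training configuration (SGD, weight decay 5e-4, batch size 10, learning rate picked from [0.01, 0.05, 0.1] by validation accuracy). The sketch below illustrates that configuration, assuming PyTorch; the `TwoLayerCNN`, the random tensors standing in for a client's local split, and the `accuracy` helper are hypothetical placeholders, not the authors' code or data.

```python
# Minimal sketch of the reported local-training setup, assuming PyTorch.
# SGD, weight decay 5e-4, batch size 10; lr selected from [0.01, 0.05, 0.1]
# by best validation accuracy. Model and data below are hypothetical stand-ins.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

class TwoLayerCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)

    def forward(self, x):
        return self.classifier(self.features(x).flatten(1))

def accuracy(model, loader):
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for x, y in loader:
            correct += (model(x).argmax(1) == y).sum().item()
            total += y.numel()
    return correct / total

# Dummy CIFAR-10-shaped tensors standing in for one client's train/validation split.
train_set = TensorDataset(torch.randn(100, 3, 32, 32), torch.randint(0, 10, (100,)))
val_set = TensorDataset(torch.randn(40, 3, 32, 32), torch.randint(0, 10, (40,)))
train_loader = DataLoader(train_set, batch_size=10, shuffle=True)  # batch size 10
val_loader = DataLoader(val_set, batch_size=10)

best_lr, best_acc = None, -1.0
for lr in [0.01, 0.05, 0.1]:  # learning-rate grid reported in the paper
    model = TwoLayerCNN()
    opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=5e-4)
    criterion = nn.CrossEntropyLoss()
    model.train()
    for x, y in train_loader:  # one pass shown; the paper trains for many rounds
        opt.zero_grad()
        criterion(model(x), y).backward()
        opt.step()
    acc = accuracy(model, val_loader)
    if acc > best_acc:
        best_lr, best_acc = lr, acc
print(f"selected lr={best_lr} (val acc={best_acc:.2f})")
```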