CoBo: Collaborative Learning via Bilevel Optimization
Authors: Diba Hashemi, Lie He, Martin Jaggi
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, COBO achieves superior performance, surpassing popular personalization algorithms by 9.3% in accuracy on a task with high heterogeneity, involving datasets distributed among 80 clients. We present three experiments to demonstrate the practical effectiveness of COBO. |
| Researcher Affiliation | Collaboration | Diba Hashemi, EPFL, diba.hashemi@epfl.ch; Lie He, Tencent Inc., liam.he15@gmail.com; Martin Jaggi, EPFL, martin.jaggi@epfl.ch |
| Pseudocode | Yes | Algorithm 1 COBO: Collaborative Learning via Bilevel Optimization |
| Open Source Code | Yes | The code is available at: https://github.com/epfml/CoBo. |
| Open Datasets | Yes | using the CIFAR-100 dataset for multi-task learning [21]... subsets of the Wiki-40B dataset [12] |
| Dataset Splits | No | The paper describes how data is distributed among clients for collaborative learning tasks and mentions relying on 'validation performance' for some baselines, but it does not provide explicit training/validation/test dataset split percentages or counts. |
| Hardware Specification | Yes | For cross-silo experiments we employed a single NVIDIA V100 GPU with 32 GB memory, and moved to four NVIDIA V100 GPUs with 32 GB memory for the cross-device experiment. Training is performed on a single NVIDIA A100 GPU with 40 GB memory. |
| Software Dependencies | No | The paper mentions models and architectures like ResNet-9, GPT-2, and LoRA, but it does not provide specific software dependencies or library names with their version numbers (e.g., PyTorch 1.9, Python 3.8). |
| Experiment Setup | Yes | We use a fixed batch size of 128 for the cross-device and cross-silo experiments on CIFAR-100. We tune each method for the optimal learning rate individually: we use a learning rate of 0.1 for Ditto, 0.05 for Federated Clustering (FC), and 0.01 for all other methods. For the language modeling experiment, we conducted the experiments with a learning rate of 0.002, batch size of 50, and 4 accumulation steps. We also used a context length of 512, dropout rate of 0.1, and a LoRA module with rank 4. |
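
The CIFAR-100 hyperparameters quoted in the Experiment Setup row can be collected into a single configuration for reference. This is a minimal sketch; the dictionary keys, method labels, and helper function are our own naming and do not mirror identifiers from the CoBo codebase (https://github.com/epfml/CoBo).

```python
# Hedged summary of the reported CIFAR-100 cross-silo / cross-device setup.
# Values come from the paper's quoted experiment setup; names are illustrative.
cifar100_config = {
    "batch_size": 128,  # fixed for both cross-silo and cross-device runs
    "learning_rate": {
        "ditto": 0.1,
        "federated_clustering": 0.05,
        "default": 0.01,  # CoBo and all remaining methods
    },
}


def lr_for(method: str) -> float:
    """Return the tuned learning rate reported for a given method name."""
    rates = cifar100_config["learning_rate"]
    return rates.get(method, rates["default"])
```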
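The language-modeling setup (GPT-2 fine-tuned with a rank-4 LoRA module on Wiki-40B subsets) can be sketched similarly. The snippet below assumes the Hugging Face transformers and peft libraries, which the paper does not name, and only the quoted values (learning rate 0.002, batch size 50, 4 accumulation steps, context length 512, dropout 0.1, LoRA rank 4) are taken from the paper; everything else is an assumption.

```python
# Hedged sketch of the reported language-modeling setup: GPT-2 with a
# rank-4 LoRA adapter. The transformers/peft stack is our assumption,
# not a dependency stated in the paper.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_name = "gpt2"  # assumed base checkpoint; the paper only says "GPT-2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

lora_config = LoraConfig(
    r=4,                        # LoRA rank 4, as stated in the paper
    lora_alpha=8,               # assumed value; not reported in the paper
    target_modules=["c_attn"],  # assumed GPT-2 attention projection target
    lora_dropout=0.1,           # the paper reports dropout 0.1 (possibly the
                                # model's dropout rather than the adapter's)
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

train_config = {
    "learning_rate": 2e-3,
    "batch_size": 50,
    "gradient_accumulation_steps": 4,
    "context_length": 512,
}
```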