Communication Bounds for the Distributed Experts Problem

Authors: Zhihao Jia, Qi Pang, Trung Tran, David Woodruff, Zhihao Zhang, Wenting Zheng

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, we implement our protocols and demonstrate empirical savings on the HPO-B benchmarks."
Researcher Affiliation | Academia | Zhihao Jia, Carnegie Mellon University, zhihao@cmu.edu; Qi Pang, Carnegie Mellon University, qipang@cmu.edu; Trung Tran, University of Pittsburgh, tbt8@pitt.edu; David Woodruff, Carnegie Mellon University, dwoodruf@cs.cmu.edu; Zhihao Zhang, Carnegie Mellon University, zhihaoz3@cs.cmu.edu; Wenting Zheng, Carnegie Mellon University, wenting@cmu.edu
Pseudocode | Yes | Algorithm 1 DEWA-S; Algorithm 2 DEWA-S-P; Algorithm 3 DEWA-M; Algorithm 4 DEWA-L; Algorithm 5 Exponential Weight Algorithm (EWA); Algorithm 6, an algorithm that reduces e-DIFFDIST to the summation-based distributed experts problem in the broadcast model
Open Source Code | Yes | "Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [Yes] Justification: We provide running scripts for all experiments in the paper."
Open Datasets | Yes | "In this section, we demonstrate the effectiveness of our algorithms on the HPO-B benchmark (Arango et al., 2021) under two setups: 1. Message-passing model with summation aggregation function and 2. Broadcast model with maximum aggregation function."
Dataset Splits | No | The paper mentions the HPO-B benchmark and synthetic datasets but does not explicitly state the training, validation, and test splits (e.g., percentages or counts) needed for reproduction.
Hardware Specification | Yes | "The experiments are run on an Ubuntu 22.04 LTS server equipped with a 12-core Intel Core i7-12700K processor and 32 GB RAM."
Software Dependencies | No | The paper specifies the operating system (Ubuntu 22.04 LTS) but does not list specific software dependencies, libraries, or frameworks with version numbers (e.g., Python, PyTorch, TensorFlow, CUDA) necessary for reproducibility.
Experiment Setup | Yes | "We set the learning rate η = 0.1, the number of servers to be s = 50, the number of experts to be n = 10^0, and the total days to be T = 10^5 for b_e = 1 and T = 10^4 for b_e = n. We set the sampling budget b_s = 2 for BASE-S and BASE-S-P."
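
The Pseudocode and Experiment Setup rows reference the standard Exponential Weight Algorithm (Algorithm 5) and the reported learning rate η = 0.1. Below is a minimal sketch of the textbook EWA update, assuming per-day expert losses in [0, 1]; the function name and the loss-range assumption are illustrative, and the paper's exact variant may differ.

```python
import numpy as np

def exponential_weights(losses, eta=0.1):
    """Textbook Exponential Weight Algorithm (EWA) sketch.

    losses: array of shape (T, n); losses[t, i] is expert i's loss on
            day t, assumed to lie in [0, 1].
    eta:    learning rate (the paper reports eta = 0.1).
    Returns the algorithm's expected loss on each day.
    """
    T, n = losses.shape
    weights = np.full(n, 1.0 / n)            # start from the uniform distribution
    expected = np.zeros(T)
    for t in range(T):
        expected[t] = weights @ losses[t]     # expected loss when playing expert i w.p. weights[i]
        weights *= np.exp(-eta * losses[t])   # multiplicative update
        weights /= weights.sum()              # renormalize to keep a distribution
    return expected
```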
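The Open Datasets and Experiment Setup rows describe a message-passing setup with s = 50 servers and a summation aggregation function. As a point of reference only, the sketch below simulates the naive full-communication baseline, where every server ships its entire loss vector to a coordinator running EWA on the summed losses; it is not the paper's DEWA-S protocol, and the function name, message accounting, and toy sizes are assumptions made for illustration. In the broadcast model with maximum aggregation, the sum in the aggregation step would be replaced by an element-wise max.

```python
import numpy as np

def naive_full_communication_ewa(server_losses, eta=0.1):
    """Naive baseline for the summation-based distributed experts problem.

    server_losses: array of shape (s, T, n); server_losses[j, t, i] is
                   server j's loss for expert i on day t.
    The coordinator receives every server's loss vector each day
    (s * n numbers per day), sums them, and runs EWA on the aggregate.
    Returns (per-day expected aggregate loss, total numbers communicated).
    """
    s, T, n = server_losses.shape
    weights = np.full(n, 1.0 / n)
    expected = np.zeros(T)
    communicated = 0
    for t in range(T):
        aggregated = server_losses[:, t, :].sum(axis=0)  # summation aggregation
        communicated += s * n                            # every server sends n values
        expected[t] = weights @ aggregated
        weights *= np.exp(-eta * aggregated)
        weights /= weights.sum()
    return expected, communicated

# Toy run with the reported s = 50 servers (T and n shrunk for illustration).
rng = np.random.default_rng(0)
losses = rng.uniform(0.0, 1.0, size=(50, 200, 10))
_, total = naive_full_communication_ewa(losses)
print("numbers communicated by the naive baseline:", total)  # 50 * 200 * 10
```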