Optimal Sparsity-Sensitive Bounds for Distributed Mean Estimation

Authors: Zengfeng Huang, Ziyue Huang, Yilei Wang, Ke Yi

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We have also conducted experimental studies, which demonstrate the advantages of our method and confirm our theoretical findings."
Researcher Affiliation | Academia | Zengfeng Huang, School of Data Science, Fudan University (huangzf@fudan.edu.cn); Ziyue Huang, Department of CSE, HKUST (zhuangbq@cse.ust.hk); Yilei Wang, Department of CSE, HKUST (ywanggq@cse.ust.hk); Ke Yi, Department of CSE, HKUST (yike@cse.ust.hk)
Pseudocode | Yes | Algorithm 1 Scaling and Rounding (SaR). Input: v ∈ R^d and a scaling factor F. 1: u = vF. 2: Randomized rounding: for j = 1, ..., d, set û_j = ⌊u_j⌋ + 1 with probability u_j − ⌊u_j⌋, and û_j = ⌊u_j⌋ otherwise. 3: return û. (A runnable sketch of this step follows the table.)
Open Source Code | No | The paper does not provide any explicit statements or links indicating that the source code for the described methodology is publicly available.
Open Datasets | Yes | "We used the MNIST [13] data set, uniformly or non-uniformly distributed across 10 clients."
Dataset Splits | No | The paper describes how data is distributed among clients ("distributed across 10 clients", "16 vectors, each held by a different client") but does not specify standard training/validation/test splits with percentages or sample counts.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts, or detailed machine specifications) used to run the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details (e.g., library or solver names with version numbers) needed to replicate the experiments.
Experiment Setup | Yes | "The number of clusters and iterations is set to 10 and 30 respectively. [...] The number of clients is set to 100 and the number of iterations is set to 15. [...] we used different values of k (quantization level) for Suresh et al.'s algorithm, k = 32 for less communication and k = 512 for less error, and other methods are tuned to achieve the same objective."
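
For concreteness, below is a minimal NumPy sketch of the Scaling and Rounding step as reconstructed in the Pseudocode row above. The function name scale_and_round and the self-test at the bottom are illustrative, not from the paper; the paper specifies only the scale-then-stochastically-round procedure itself.

import numpy as np

def scale_and_round(v, F, rng=None):
    """Scaling and Rounding (SaR), a sketch of Algorithm 1 quoted above.

    Each coordinate of u = vF is rounded to floor(u_j) + 1 with
    probability u_j - floor(u_j), and to floor(u_j) otherwise, so the
    result is an unbiased integer-valued estimate of vF.
    """
    rng = np.random.default_rng() if rng is None else rng
    u = np.asarray(v, dtype=float) * F     # step 1: u = vF
    base = np.floor(u)                     # lower integer for each coordinate
    p_up = u - base                        # probability of rounding up
    round_up = rng.random(u.shape) < p_up  # step 2: randomized rounding
    return base + round_up                 # step 3: return u_hat

if __name__ == "__main__":
    v = np.array([0.2, -1.7, 3.5])
    F = 4.0
    # Averaging many independent roundings should recover v * F,
    # illustrating the unbiasedness of the randomized rounding.
    samples = np.stack([scale_and_round(v, F) for _ in range(10000)])
    print(samples.mean(axis=0), "vs.", v * F)

The key property of this rounding rule is that E[û_j] = u_j exactly, which is what makes the downstream mean estimator unbiased while only integer values need to be communicated.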