Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

BEER: Fast $O(1/T)$ Rate for Decentralized Nonconvex Optimization with Communication Compression

Authors: Haoyu Zhao, Boyue Li, Zhize Li, Peter Richtárik, Yuejie Chi

NeurIPS 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Numerical experiments are also provided to corroborate our theory and confirm the practical superiority of BEER in the data heterogeneous regime."
Researcher Affiliation | Academia | Haoyu Zhao (Princeton University), Boyue Li (Carnegie Mellon University), Zhize Li (Carnegie Mellon University), Peter Richtárik (King Abdullah University of Science and Technology), Yuejie Chi (Carnegie Mellon University)
Pseudocode | Yes | "Algorithm 1 BEER: BEtter comprEssion for decentRalized optimization"
Open Source Code | Yes | "The code can be accessed at: https://github.com/liboyue/beer."
Open Datasets | Yes | "We run experiments on two nonconvex problems to compare with the baseline algorithms both with and without communication compression: logistic regression with a nonconvex regularizer [52] on the a9a dataset [5], and training a 1-hidden layer neural network on the MNIST dataset [20]."
Dataset Splits | No | The paper mentions splitting "unshuffled datasets evenly to 10 clients" but gives no train/validation/test percentages, sample counts, or explicit instructions for how the dataset was split beyond distributing it among clients.
Hardware Specification | No | The paper does not report hardware details such as GPU models, CPU types, or memory. The author checklist answers "No" to "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)?"
Software Dependencies | No | The paper mentions a "biased gsgdb compression [1]" but does not list software dependencies with version numbers (e.g., Python, PyTorch, TensorFlow, or other libraries/solvers).
Experiment Setup | Yes | "Moreover, we use the same best-tuned learning rate η = 0.1, batch size b = 100, and biased compression operator (gsgdb) [1] for BEER and CHOCO-SGD on both experiments."
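To make the first experimental objective concrete, the sketch below implements logistic regression with a smooth nonconvex regularizer. The report only cites the regularizer as [52]; the specific form lam * sum(alpha * w_i^2 / (1 + alpha * w_i^2)) is an assumption here (it is a common choice in the decentralized-optimization literature), and all function names and parameters are illustrative.

```python
import numpy as np

def nonconvex_logistic_loss(w, X, y, lam=0.1, alpha=1.0):
    """Logistic loss with a smooth nonconvex regularizer.

    ASSUMPTION: the regularizer lam * sum(alpha*w_i^2 / (1 + alpha*w_i^2))
    is a stand-in for the one cited as [52] in the paper, not a confirmed
    reproduction of it. Labels y are in {-1, +1}.
    """
    logits = X @ w
    loss = np.mean(np.log1p(np.exp(-y * logits)))      # average logistic loss
    reg = lam * np.sum(alpha * w**2 / (1.0 + alpha * w**2))
    return loss + reg

def nonconvex_logistic_grad(w, X, y, lam=0.1, alpha=1.0):
    """Gradient of the objective above."""
    logits = X @ w
    s = -y / (1.0 + np.exp(y * logits))                # d(loss)/d(logit) per sample
    g_loss = X.T @ s / X.shape[0]
    g_reg = lam * 2.0 * alpha * w / (1.0 + alpha * w**2) ** 2
    return g_loss + g_reg
```

With this objective in hand, the reported setup (learning rate η = 0.1, batch size b = 100) corresponds to plain minibatch SGD steps `w -= 0.1 * nonconvex_logistic_grad(w, X_batch, y_batch)` on each client, before any compression or gossip communication is layered on top.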