Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
DoCoM: Compressed Decentralized Optimization with Near-Optimal Sample Complexity
Authors: Chung-Yiu Yau, Hoi To Wai
TMLR 2023 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrate that our algorithm outperforms several state-of-the-art algorithms in practice. We empirically evaluate the performance of DoCoM on training linear models and deep learning models using synthetic and real data, on non-convex losses. |
| Researcher Affiliation | Academia | Chung-Yiu Yau EMAIL The Chinese University of Hong Kong. Hoi-To Wai EMAIL The Chinese University of Hong Kong. |
| Pseudocode | Yes | Algorithm 1: DoCoM Algorithm |
| Open Source Code | No | The paper does not contain an explicit statement about releasing the source code for the DoCoM algorithm, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Synthetic Data with Linear Model: Consider a set of synthetic data generated with the leaf benchmarking framework (Caldas et al., 2019)... MNIST Data with Feed-forward Network: ...on the MNIST dataset... FEMNIST Data with LeNet-5: ...on the FEMNIST dataset. |
| Dataset Splits | No | The paper describes how data is partitioned among agents (e.g., 'm = 1443 samples partitioned into n = 25 non-i.i.d. portions', 'samples are partitioned into n = 10 agents where each agent only gets 1 class of samples'), but it does not specify conventional training, validation, or test dataset splits in terms of percentages or absolute counts. |
| Hardware Specification | Yes | We run the decentralized optimization algorithms on a 40 threads Intel(R) Xeon(R) Gold 6148 CPU @ 2.40GHz server with MPI-enabled PyTorch and evaluate the performance of trained models on a Tesla K80 GPU server. |
| Software Dependencies | No | The paper mentions 'MPI-enabled PyTorch' but does not provide specific version numbers for PyTorch, MPI, or any other software dependencies. |
| Experiment Setup | Yes | For all algorithms we choose the learning rate η from {0.1, 0.01, 0.001}, and fix the regularization parameter as λ = 10^{-4} ... For DoCoM and GT-HSGD, we choose the best momentum parameter β in {0.0001, 0.001, 0.01, 0.1, 0.5, 0.9} and fix the initial batch number as b_{0,i} = m_i. We choose the batch sizes such that all algorithms spend the same amount of computation on stochastic gradient per iteration... Table 2: Tuned hyper-parameters for linear model on synthetic dataset... Table 3: Tuned hyper-parameters for 1 layer feed-forward network on MNIST... Table 4: Tuned hyper-parameters for LeNet-5 on FEMNIST. |
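
The Experiment Setup row quotes a small search grid over the learning rate η and the momentum parameter β. As a rough illustration only, the sketch below enumerates that grid and picks the best pair. It is not the authors' code (the paper releases none). The names `candidate_configs`, `select_best`, and the `evaluate` callback are hypothetical, and the selection criterion (lowest returned score) is an assumption, since the quoted text does not say how "best" was measured.

```python
from itertools import product

# Candidate values quoted in the Experiment Setup row.
LEARNING_RATES = [0.1, 0.01, 0.001]               # eta
MOMENTA = [0.0001, 0.001, 0.01, 0.1, 0.5, 0.9]    # beta (DoCoM and GT-HSGD only)


def candidate_configs():
    """Return every (eta, beta) combination in the quoted grid."""
    return [
        {"eta": eta, "beta": beta}
        for eta, beta in product(LEARNING_RATES, MOMENTA)
    ]


def select_best(configs, evaluate):
    """Return the config with the lowest score.

    `evaluate` is a hypothetical callback that trains with the given
    config and returns a scalar to minimize (an assumed criterion).
    """
    return min(configs, key=evaluate)


configs = candidate_configs()
print(len(configs))  # 3 learning rates x 6 momentum values
```

A real run would pass a training routine as `evaluate`. Algorithms without a momentum parameter would search only over `LEARNING_RATES`.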