Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Distributed Mean Estimation with Limited Communication
Authors: Ananda Theertha Suresh, Felix X. Yu, Sanjiv Kumar, H. Brendan McMahan
ICML 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We finally demonstrate the practicality of our algorithms by applying them to distributed Lloyd's algorithm for k-means and power iteration for PCA." and "We demonstrate two applications in the rest of this section. The experiments are performed on the MNIST (d = 1024) and CIFAR (d = 512) datasets." |
| Researcher Affiliation | Industry | 1Google Research, New York, NY, USA 2Google Research, Seattle, WA, USA. |
| Pseudocode | No | Information insufficient. The paper describes algorithms in prose and mathematical notation but does not include any blocks labeled "Pseudocode" or "Algorithm". |
| Open Source Code | No | Information insufficient. The paper does not mention providing open-source code for the described methodology. |
| Open Datasets | Yes | The experiments are performed on the MNIST (d = 1024) and CIFAR (d = 512) datasets. |
| Dataset Splits | No | Information insufficient. The paper mentions the datasets used but does not provide specific details on how they were split into training, validation, or test sets. |
| Hardware Specification | No | Information insufficient. The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running its experiments. |
| Software Dependencies | No | Information insufficient. The paper does not provide specific software dependencies or their version numbers. |
| Experiment Setup | Yes | Here we test two settings: 16 quantization levels and 32 quantization levels. We set both the number of centers and number of clients to 10. The dataset is distributed over 100 clients. |