Communication-Efficient Distributed Optimization with Quantized Preconditioners
Authors: Foivos Alimisis, Peter Davies, Dan Alistarh
ICML 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also validate our findings experimentally, showing fast convergence and reduced communication. |
| Researcher Affiliation | Collaboration | 1Department of Mathematics, University of Geneva, Switzerland (work done while at IST Austria) 2IST Austria 3Neural Magic, US |
| Pseudocode | Yes | The algorithm is presented in a numbered list of steps under section 3.1 'The Algorithm' and 4.1 'Algorithm Description', formatted as structured steps for a method. |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the described methodology. |
| Open Datasets | Yes | Dataset We use the dataset `cpusmall_scale` from LIBSVM (Chang & Lin, 2011). ... We demonstrate the methods on the `phishing` and `german.numer` datasets from the LIBSVM collection (Chang & Lin, 2011) |
| Dataset Splits | No | The paper does not provide specific details about training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running the experiments. |
| Software Dependencies | No | The paper mentions using QSGD (Alistarh et al., 2016) and the Hadamard-rotation based method (Suresh et al., 2017) for gradient quantization but does not specify version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The learning rate (lr in the figure titles) is set close to the maximum for which gradient descent will converge... The number of bits per coordinate used to quantize gradients (qb) and preconditioners (pb) are also shown; the latter is an average since the quantization method uses a variable number of bits. ... we test each with learning rates in {2^0, 2^-1, 2^-2, ...}, and plot the highest rate for which the method stably converges. |
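The learning-rate selection protocol quoted in the Experiment Setup row can be sketched as a small grid search. This is a hedged illustration, not the authors' code: `run_experiment` is a hypothetical stand-in (here, gradient descent on a toy quadratic) for the paper's actual training runs, and the convergence threshold is an assumption.

```python
# Sketch of the sweep described in the paper: try rates 2^0, 2^-1, 2^-2, ...
# and keep the largest one for which the run stably converges.

def converged(losses):
    """Toy convergence check (assumed): final loss is finite and near zero."""
    last = losses[-1]
    return last == last and last < 1e-6  # `last == last` filters out NaN

def run_experiment(lr, steps=50):
    """Hypothetical stand-in for training: minimize f(x) = x^2 by GD."""
    x, losses = 1.0, []
    for _ in range(steps):
        x -= lr * 2 * x          # gradient of x^2 is 2x
        losses.append(x * x)
    return losses

def highest_stable_rate(num_rates=8):
    """Return the first (i.e., largest) rate in 2^0, 2^-1, ... that converges."""
    for k in range(num_rates):
        lr = 2.0 ** (-k)
        if converged(run_experiment(lr)):
            return lr
    return None
```

On the toy quadratic, `lr = 2^0 = 1` makes the iterate oscillate without progress, so the sweep settles on `2^-1`; in the paper the same protocol is applied per method to its real training loss.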