CANITA: Faster Rates for Distributed Convex Optimization with Communication Compression
Authors: Zhize Li, Peter Richtárik
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 6 Experiments: In this section, we demonstrate the performance of our accelerated method CANITA (Algorithm 1) and previous methods QSGD and DIANA (the theoretical convergence results of these algorithms can be found in Table 1) with different compression operators on the logistic regression problem $\min_{x \in \mathbb{R}^d} f(x) := \frac{1}{n} \sum_{i=1}^{n} \log\big(1 + \exp(-b_i a_i^\top x)\big)$ (14), where $\{a_i, b_i\}_{i=1}^{n} \in \mathbb{R}^d \times \{\pm 1\}$ are data samples. We use three standard datasets: a9a, mushrooms, and w8a in the experiments. All datasets are downloaded from LIBSVM [4]. In Figures 1–3, we compare our CANITA with QSGD and DIANA with three compression operators: random sparsification (left), natural compression (middle), and random quantization (right) on three datasets: a9a (Figure 1), mushrooms (Figure 2), and w8a (Figure 3). The x-axis and y-axis represent the number of communication bits and the training loss, respectively. |
| Researcher Affiliation | Academia | Zhize Li, KAUST, zhize.li@kaust.edu.sa; Peter Richtárik, KAUST, peter.richtarik@kaust.edu.sa |
| Pseudocode | Yes | Algorithm 1 Distributed compressed accelerated ANITA method (CANITA) |
| Open Source Code | No | The paper does not contain any explicit statement about providing open-source code for the described methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | We use three standard datasets: a9a, mushrooms, and w8a in the experiments. All datasets are downloaded from LIBSVM [4]. |
| Dataset Splits | No | The paper mentions using 'three standard datasets: a9a, mushrooms, and w8a' but does not specify any explicit training, validation, or test split percentages or sample counts for these datasets. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, or memory specifications used for running experiments. |
| Software Dependencies | No | The paper mentions using LIBSVM but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In our experiments, we directly use the theoretical stepsizes and parameters for all three algorithms: QSGD [1, 24], DIANA [12], our CANITA (Algorithm 1). To compare with the settings of DIANA and CANITA, we use local gradients (not stochastic gradients) in QSGD. |
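For context, the logistic regression objective (14) and two of the compression operators named in the experiments (random sparsification and natural compression) can be sketched in NumPy. This is an illustrative reimplementation, not the authors' code: the function names, the rand-k form of sparsification, and the power-of-two rounding details are assumptions consistent with standard definitions of these operators.

```python
import numpy as np

def logistic_loss(x, A, b):
    """Objective (14): f(x) = (1/n) * sum_i log(1 + exp(-b_i * a_i^T x)).

    A has shape (n, d), b has entries in {-1, +1}.
    """
    margins = -b * (A @ x)
    # log(1 + exp(m)) computed stably as logaddexp(0, m)
    return np.mean(np.logaddexp(0.0, margins))

def rand_k(g, k, rng):
    """Random sparsification (rand-k): keep k random coordinates,
    rescale by d/k so the compressor is unbiased, E[rand_k(g)] = g."""
    d = g.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros_like(g)
    out[idx] = g[idx] * (d / k)
    return out

def natural_compression(g, rng):
    """Natural compression: randomly round each magnitude to one of the
    two nearest powers of two, with probabilities chosen for unbiasedness."""
    out = np.zeros_like(g)
    nz = g != 0
    mag = np.abs(g[nz])
    low = 2.0 ** np.floor(np.log2(mag))   # nearest power of two below
    p = mag / low - 1.0                    # prob. of rounding up to 2*low
    up = rng.random(mag.size) < p
    out[nz] = np.sign(g[nz]) * low * np.where(up, 2.0, 1.0)
    return out
```

In the distributed setting of the paper, each worker would compress its local gradient with one such operator before communication; both sketches are unbiased, which is the property the convergence analysis of CANITA, QSGD, and DIANA relies on.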