Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
Authors: Ali Ramezani-Kebrya, Kimon Antonakopoulos, Igor Krawczuk, Justin Deschenaux, Volkan Cevher
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we validate our theoretical results by providing real-world experiments and training generative adversarial networks on multiple GPUs. |
| Researcher Affiliation | Academia | ali@uio.no; firstname.lastname@epfl.ch. These authors contributed equally to this work. Department of Informatics, University of Oslo; work performed at EPFL (LIONS) and Aalborg University. Laboratory for Information and Inference Systems (LIONS), EPFL. |
| Pseudocode | Yes | Algorithm 1 QGenX: Loops are executed in parallel on processors. At certain steps, each processor computes sufficient statistics of a parametric distribution to estimate the distribution of dual vectors. (A hedged sketch of this pattern appears below the table.) |
| Open Source Code | Yes | Open source code will be released at https://github.com/LIONS-EPFL/qgenx |
| Open Datasets | Yes | train a WGAN-GP (Arjovsky et al., 2017) on CIFAR10 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper states it uses CIFAR10, which has a fixed standard split (50,000 training / 10,000 test images), but does not explicitly provide the training/validation/test split percentages or sample counts in the text. (See the loading sketch below the table.) |
| Hardware Specification | Yes | The experiments were performed on 3 Nvidia V100 GPUs (1 per node) using a Kubernetes cluster and an image built on the torch_cgx Docker image. |
| Software Dependencies | No | The paper mentions "torch_cgx pytorch extension", "Open MPI", and "Weights and Biases", but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | we share an effective batch size of 1024 across 3 nodes (strong scaling) connected via Ethernet, and use Layernorm (Ba et al., 2016) instead of Batchnorm (Ioffe & Szegedy, 2015). The results are shown in Figure 1 (left), showing the evolution of FID. (See the normalization sketch below the table.) |
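The pseudocode row above describes the distributed pattern behind Algorithm 1 (QGenX): workers run extra-gradient steps in parallel and compress what they communicate. The following is a minimal PyTorch sketch of that general pattern only, not the paper's algorithm: `quantize`, `average_quantized`, `extragradient_step`, `grad_fn`, and `num_levels` are illustrative names, and the paper's adaptive quantizer (built from sufficient statistics of a parametric distribution fitted to the dual vectors) is replaced here by a plain uniform quantizer.

```python
import torch
import torch.distributed as dist

def quantize(v, num_levels=16):
    # Hypothetical uniform quantizer standing in for the paper's
    # adaptive scheme, which places levels according to an estimated
    # parametric distribution of the dual (gradient) vectors.
    scale = v.abs().max().clamp(min=1e-12)
    return torch.round(v / scale * (num_levels - 1)) / (num_levels - 1) * scale

def average_quantized(g):
    # Each worker quantizes its local gradient before communication;
    # the quantized vectors are then averaged across workers.
    q = quantize(g)
    dist.all_reduce(q, op=dist.ReduceOp.SUM)
    return q / dist.get_world_size()

def extragradient_step(x, grad_fn, lr):
    # Extra-gradient: extrapolate using the operator at x, then
    # update x using the operator at the extrapolated point.
    g = average_quantized(grad_fn(x))
    x_half = x - lr * g
    g_half = average_quantized(grad_fn(x_half))
    return x - lr * g_half
```

Quantizing both communication rounds is an assumption of this sketch; the exact placement of compression in QGenX follows the paper's Algorithm 1.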
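For the dataset rows: CIFAR10 ships with a fixed train/test split, which a loading script makes explicit even though the paper does not restate it. A short torchvision sketch (the root path and transform are placeholders):

```python
from torchvision import datasets, transforms

# CIFAR10's standard split is 50,000 training and 10,000 test images;
# the split is fixed by the dataset itself.
tfm = transforms.ToTensor()
train_set = datasets.CIFAR10(root="./data", train=True, download=True, transform=tfm)
test_set = datasets.CIFAR10(root="./data", train=False, download=True, transform=tfm)
assert len(train_set) == 50_000 and len(test_set) == 10_000
```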
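For the experiment-setup row, a sketch of the two quoted choices: splitting the fixed effective batch of 1024 across 3 nodes (strong scaling), and removing batch statistics from the model. The helper name `replace_batchnorm` and the use of single-group GroupNorm as a convolutional stand-in for LayerNorm are assumptions for illustration; the paper says only that it uses Layernorm instead of Batchnorm.

```python
import torch.nn as nn

WORLD_SIZE = 3        # one V100 GPU per node, per the hardware row
GLOBAL_BATCH = 1024   # "effective batch size of 1024" shared across nodes
# Strong scaling: the global batch is fixed and split per node.
# 1024 is not divisible by 3; how the remainder is assigned is not
# stated in the paper, so floor division is an assumption here.
local_batch = GLOBAL_BATCH // WORLD_SIZE

def replace_batchnorm(module: nn.Module) -> nn.Module:
    # Recursively swap BatchNorm2d for single-group GroupNorm, a
    # common convolutional stand-in for LayerNorm: like LayerNorm, it
    # normalizes per sample and so needs no cross-batch statistics.
    for name, child in module.named_children():
        if isinstance(child, nn.BatchNorm2d):
            setattr(module, name, nn.GroupNorm(1, child.num_features))
        else:
            replace_batchnorm(child)
    return module
```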