Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Distributed Extra-gradient with Optimal Complexity and Communication Guarantees
Authors: Ali Ramezani-Kebrya, Kimon Antonakopoulos, Igor Krawczuk, Justin Deschenaux, Volkan Cevher
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we validate our theoretical results by providing real-world experiments and training generative adversarial networks on multiple GPUs. |
| Researcher Affiliation | Academia | These authors contributed equally to this work. Department of Informatics, University of Oslo. Work performed at EPFL (LIONS) and Aalborg University. Laboratory for Information and Inference Systems (LIONS), EPFL. |
| Pseudocode | Yes | Algorithm 1 Q-Gen X: Loops are executed in parallel on processors. At certain steps, each processor computes sufficient statistics of a parametric distribution to estimate distribution of dual vectors. |
| Open Source Code | Yes | Open source code will be released at https://github.com/LIONS-EPFL/qgenx |
| Open Datasets | Yes | train a WGAN-GP (Arjovsky et al., 2017) on CIFAR10 (Krizhevsky, 2009). |
| Dataset Splits | No | The paper states it uses CIFAR10, which has standard splits, but does not explicitly provide the training/validation/test split percentages or sample counts in the text. |
| Hardware Specification | Yes | The experiments were performed on 3 Nvidia V100 GPUs (1 per node) using a Kubernetes cluster and an image built on the torch_cgx Docker image. |
| Software Dependencies | No | The paper mentions "torch_cgx pytorch extension", "Open MPI", and "Weights and Biases", but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | we share an effective batch size of 1024 across 3 nodes (strong scaling) connected via Ethernet, and use Layernorm (Ba et al., 2016) instead of Batchnorm (Ioffe & Szegedy, 2015). The results are shown in Figure 1 (left) showing evolution of FID. |
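For readers unfamiliar with the method named in the paper's title, the sketch below illustrates the classic (single-node) extra-gradient update that distributed variants such as the paper's Q-GenX build on. The bilinear saddle problem, step size, and function names here are hypothetical illustrations chosen for this sketch, not details from the paper.

```python
import numpy as np

def operator(z):
    # Gradient field of the bilinear saddle problem min_x max_y x*y:
    # F(z) = (dL/dx, -dL/dy) = (y, -x). Plain gradient descent-ascent
    # cycles on this problem; extra-gradient converges.
    x, y = z
    return np.array([y, -x])

def extragradient(z0, step=0.1, iters=100):
    """Classic extra-gradient iteration (Korpelevich-style):
    take an exploratory half-step, then update from the gradient
    evaluated at that leading point."""
    z = np.asarray(z0, dtype=float)
    for _ in range(iters):
        z_half = z - step * operator(z)    # exploratory (leading) step
        z = z - step * operator(z_half)    # update using leading gradient
    return z

z_final = extragradient([1.0, 1.0])
```

In a distributed setting such as the paper's, each node would evaluate the operator on its data shard and communicate (possibly quantized) updates before the two steps; the sketch above omits all of that machinery.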