Decentralized Local Stochastic Extra-Gradient for Variational Inequalities

Authors: Aleksandr Beznosikov, Pavel Dvurechenskii, Anastasiia Koloskova, Valentin Samokhin, Sebastian U. Stich, Alexander Gasnikov

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify our theoretical results in numerical experiments and demonstrate the practical effectiveness of the proposed scheme. In particular, we train the DCGAN [79] architecture on the CIFAR-10 [51] dataset." Also, from Section 5 (Experiments): "In this section, we present two sets of experiments to validate the performance of Algorithm 1."
Researcher Affiliation | Collaboration | Aleksandr Beznosikov: Innopolis University, MIPT, HSE University, and Yandex
Pseudocode | Yes | Algorithm 1: Extra Step Time-Varying Gossip Method (a hedged sketch of such an update follows the table)
Open Source Code | No | The paper's checklist answers "Yes" to "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)?", but the main text provides neither a URL nor an explicit statement that the code is included in the supplementary material.
Open Datasets | Yes | "We consider the CIFAR-10 [51] dataset containing 60000 images, equally distributed over 10 classes."
Dataset Splits | No | The paper partitions the CIFAR-10 dataset into 16 subsets, one per node, and notes that the images are "equally distributed over 10 classes", but it gives no training/validation/test splits (percentages or counts) and no details on how the node partition was constructed (an illustrative partitioning sketch follows the table).
Hardware Specification | No | "We simulate a distributed setup of 16 nodes on two GPUs and use Ray [67]." Two GPUs are mentioned, but the exact GPU model or type is not specified (a hedged Ray sketch of such a setup follows the table).
Software Dependencies | No | The paper mentions using Ray [67] and Adam [42] as the optimizer, but it gives no version numbers for these or for any other key software components used in the experiments.
Experiment Setup | Yes | "We use the same learning rate equal to 0.002 for the generator and discriminator. The rest of the parameters and features of the architecture can be found in the supplementary material."
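
The pseudocode row above references Algorithm 1 (Extra Step Time-Varying Gossip Method). The following is a minimal sketch of one decentralized extra-gradient update with time-varying gossip averaging, in the spirit of that algorithm but not a reproduction of it: the toy affine operators F_m, the shifting-ring topology, the step size, and the iteration count are all illustrative assumptions.

```python
import numpy as np

# Sketch: decentralized extra-gradient ("extra step") with a
# time-varying gossip matrix. Everything below the paper's high-level
# idea (extrapolate, then update, with gossip mixing) is assumed.

rng = np.random.default_rng(0)
M, d = 16, 10                      # number of nodes, problem dimension

# Each node m holds a strongly monotone affine operator F_m(z) = A_m z + b_m.
A = []
for _ in range(M):
    R = rng.standard_normal((d, d))
    A.append(np.eye(d) + 0.05 * (R + R.T))   # symmetric, well conditioned
b = [rng.standard_normal(d) for _ in range(M)]

def F(m, z, noise=0.01):
    """Stochastic oracle for node m's operator."""
    return A[m] @ z + b[m] + noise * rng.standard_normal(d)

def gossip_matrix(k, M):
    """Doubly stochastic mixing matrix for a time-varying topology:
    at iteration k each node mixes with a neighbor shifted by k."""
    W = np.zeros((M, M))
    for m in range(M):
        W[m, m] += 0.5
        W[m, (m + 1 + k) % M] += 0.5
    return 0.5 * (W + W.T)         # symmetrize: rows and columns sum to 1

z = np.zeros((M, d))               # one local iterate per node
gamma = 0.3                        # illustrative step size
for k in range(300):
    W = gossip_matrix(k, M)
    z_mix = W @ z                  # communication (gossip) round
    # Extrapolation step from the mixed iterate.
    z_half = z_mix - gamma * np.stack([F(m, z[m]) for m in range(M)])
    # Update step, with the operator evaluated at the extrapolated point.
    z = z_mix - gamma * np.stack([F(m, z_half[m]) for m in range(M)])

print("consensus gap:", np.linalg.norm(z - z.mean(axis=0)))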
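```

Since the dataset-splits row flags the node partitioning as under-specified, here is one plausible way to shard the CIFAR-10 training set across 16 simulated nodes. The uniform random split and the fixed seed are assumptions for illustration, not the paper's procedure.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

# Sketch: split the 50k CIFAR-10 training images into 16 equal shards,
# one per simulated node. The random, class-agnostic split is assumed.
train_set = datasets.CIFAR10(root="./data", train=True, download=True,
                             transform=transforms.ToTensor())
num_nodes = 16
shard_sizes = [len(train_set) // num_nodes] * num_nodes
shard_sizes[-1] += len(train_set) - sum(shard_sizes)  # absorb any remainder
node_shards = random_split(train_set, shard_sizes,
                           generator=torch.Generator().manual_seed(0))
```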
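
For the hardware and setup rows, the sketch below shows one way to simulate 16 workers sharing two GPUs with Ray, each worker holding Adam optimizers at the stated learning rate of 0.002 for both players. The fractional GPU assignment and the stand-in linear networks are assumptions; the actual DCGAN configuration is deferred to the paper's supplementary material.

```python
import ray
import torch

ray.init(num_gpus=2)               # two GPUs, as stated in the paper

@ray.remote(num_gpus=2 / 16)       # 16 workers share the two GPUs
class Worker:
    def __init__(self):
        device = "cuda" if torch.cuda.is_available() else "cpu"
        # Stand-in networks; the real experiment trains a DCGAN.
        self.gen = torch.nn.Linear(100, 32).to(device)
        self.disc = torch.nn.Linear(32, 1).to(device)
        # Same learning rate (0.002) for generator and discriminator.
        self.opt_g = torch.optim.Adam(self.gen.parameters(), lr=0.002)
        self.opt_d = torch.optim.Adam(self.disc.parameters(), lr=0.002)

    def ready(self):
        return True

workers = [Worker.remote() for _ in range(16)]
ray.get([w.ready.remote() for w in workers])  # wait for all workers
```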