Extra-gradient with player sampling for faster convergence in n-player games
Authors: Samy Jelassi, Carles Domingo-Enrich, Damien Scieur, Arthur Mensch, Joan Bruna
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we first validate that DSEG is faster in massive differentiable convex games with noisy gradient oracles. We further show that non-random player selection improves convergence speed, and provide explanations for this phenomenon. In practical non-convex settings, we find that cyclic player sampling improves the speed and performance of GAN training (CIFAR10, ResNet architecture). |
| Researcher Affiliation | Collaboration | Samy Jelassi*,1, Carles Domingo-Enrich*,2, Damien Scieur3, Arthur Mensch4,2, Joan Bruna2 ... 1 Princeton University, USA; 2 NYU CIMS, New York, USA; 3 Samsung SAIT AI Lab, Montreal, Canada; 4 ENS, DMA, Paris, France. |
| Pseudocode | Yes | Algorithm 1 Doubly-stochastic extra-gradient. ... Algorithm 2 Variance reduced estimate of the simultaneous gradient with doubly-stochastic sampling |
| Open Source Code | Yes | A PyTorch/NumPy package is attached. |
| Open Datasets | Yes | We evaluate the performance of the player sampling approach to train a generative model on CIFAR10 (Krizhevsky & Hinton, 2009). |
| Dataset Splits | No | No explicit information on training, validation, or test dataset splits (e.g., 80/10/10 percentages or specific sample counts) was found in the paper. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, cloud instances with specifications) used for running experiments were found in the paper. |
| Software Dependencies | No | A PyTorch/NumPy package is attached, but no explicit dependency versions (e.g., PyTorch or NumPy version numbers) are reported in the paper. |
| Experiment Setup | Yes | We selected all hyperparameters (step size and batch size) through a grid search for each experimental setting and each number of sampled players... Learning rates were selected by grid searching over {1e−5, 3e−5, 5e−5, 1e−4, 3e−4, 5e−4} for each of the methods. |
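For reference, the doubly-stochastic extra-gradient scheme named in the pseudocode row can be sketched in a few lines of NumPy. This is a minimal illustration loosely following the spirit of Algorithm 1, not the authors' implementation: the quadratic game, the operator `M`, and the step sizes below are hypothetical stand-ins, and the masked estimator omits the paper's variance-reduction option (Algorithm 2).

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy game (not the paper's benchmark): n players with one
# scalar parameter each, simultaneous gradient F(theta) = M @ theta where
# M has a positive-definite symmetric part, so the operator is strongly
# monotone and the unique equilibrium is theta = 0.
n_players = 5
G = rng.standard_normal((n_players, n_players))
M = 0.5 * np.eye(n_players) + 0.2 * (G - G.T)  # strongly monotone operator

def F(theta):
    """Simultaneous gradient of the game at theta."""
    return M @ theta

def sampled_grad(theta, k):
    """Unbiased gradient estimate that touches only k sampled players."""
    mask = np.zeros(n_players)
    idx = rng.choice(n_players, size=k, replace=False)
    mask[idx] = n_players / k  # importance weight keeps the estimate unbiased
    return mask * F(theta)

def dseg(theta0, lr=0.1, k=2, n_iters=3000):
    """Doubly-stochastic extra-gradient sketch: independent player subsets
    are sampled for the extrapolation step and for the update step."""
    theta = theta0.copy()
    for _ in range(n_iters):
        theta_half = theta - lr * sampled_grad(theta, k)       # extrapolation
        theta = theta - lr * sampled_grad(theta_half, k)       # update
    return theta

theta0 = rng.standard_normal(n_players)
theta = dseg(theta0)
print(np.linalg.norm(theta), "vs initial", np.linalg.norm(theta0))
```

Each iteration queries gradients for only `k` of the `n` players (here 2 of 5), which is the source of the per-iteration cost savings the paper studies; the importance weight `n_players / k` keeps the masked gradient an unbiased estimate of the full simultaneous gradient.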