Generalized Natural Gradient Flows in Hidden Convex-Concave Games and GANs
Authors: Andjela Mladenovic, Iosif Sakos, Gauthier Gidel, Georgios Piliouras
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | From Section 6, "Experimental Results": "Toy multi-dimensional case. As a first experiment, we consider the HCC objective..." and "GANs. For our second experiment, we implement the NHG dynamics to train a GAN." The experiments reveal that the approach provides good performance and convergence guarantees. |
| Researcher Affiliation | Academia | Andjela Mladenovic (Univ. of Montréal & Mila), Iosif Sakos (SUTD), Gauthier Gidel (Univ. of Montréal & Mila), Georgios Piliouras (SUTD) |
| Pseudocode | No | The paper contains mathematical equations describing dynamics (e.g., (D1), (D2), (D3)) but no explicitly labeled 'Algorithm' or 'Pseudocode' blocks. |
| Open Source Code | Yes | "The details of both experiments can be found within the source-code files included with this work." and "We also include the source code files and the necessary input files in the supplementary material that accompanies this work." |
| Open Datasets | Yes | "train large models on Imagenet and CIFAR (Martens et al., 2021; Arbel et al., 2020)." |
| Dataset Splits | No | The paper mentions using a 'synthetic experiment' and notes about 'Imagenet and CIFAR', but does not provide specific training/validation/test splits, percentages, or absolute sample counts for these datasets. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware used (e.g., GPU/CPU models, memory specifications, or cloud instances) for running its experiments. |
| Software Dependencies | No | The paper mentions 'scipy package in Python 3' and 'Optuna,' but it does not provide specific version numbers for these software dependencies (e.g., 'scipy 1.x.x', 'Optuna 2.x.x'), nor for Python itself beyond 'Python 3'. |
| Experiment Setup | Yes | "We adapt K-FAC (Martens & Grosse, 2015) as an approximator to compute NHG, with a learning rate of 10^-4, selected using the standard parameter optimizer package Optuna (Akiba et al., 2019). We used gradient clipping to a maximum norm of 1 on both the discriminator and the generator, to stabilize the optimization." and "Regarding the GAN, we consider a Flow-GAN architecture where our generator, G, is a Real NVP consisting of 8 coupling layers (Dinh et al., 2017; 2014), and the discriminator is a 4-layer MLP with 256-128-64-1 output features." |
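The quoted setup specifies two concrete, reproducible pieces: gradient clipping to a maximum global norm of 1 and a 4-layer MLP discriminator with 256-128-64-1 output features. A minimal NumPy sketch of those two components follows; it is an illustration, not the authors' code, and the input width (`in_dim`) and weight initialization are assumptions, since the paper's quoted text does not state them.

```python
import numpy as np

def clip_grad_norm(grads, max_norm=1.0):
    """Rescale a list of gradient arrays so their global L2 norm is at most max_norm."""
    total = np.sqrt(sum(float(np.sum(g ** 2)) for g in grads))
    if total > max_norm:
        scale = max_norm / total
        grads = [g * scale for g in grads]
    return grads

def init_discriminator(in_dim=784, widths=(256, 128, 64, 1), seed=0):
    """4-layer MLP matching the quoted 256-128-64-1 output features.

    in_dim and the Gaussian init scale are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    params, d = [], in_dim
    for w in widths:
        params.append((rng.normal(0.0, 0.02, size=(d, w)), np.zeros(w)))
        d = w
    return params

def discriminator(params, x):
    """Forward pass: ReLU on hidden layers, raw logit at the output."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.maximum(h, 0.0)
    return h
```

For example, a gradient list with norm 10 is rescaled so its global norm equals the clip threshold of 1, and a batch of 5 inputs yields a (5, 1) logit array.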