Generalized Natural Gradient Flows in Hidden Convex-Concave Games and GANs

Authors: Andjela Mladenovic, Iosif Sakos, Gauthier Gidel, Georgios Piliouras

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "6 EXPERIMENTAL RESULTS. Toy multi-dimensional case. As a first experiment, we consider the HCC objective... GANs. For our second experiment, we implement the NHG dynamics to train a GAN." The experiments reveal that the approach provides good performance and convergence guarantees.
Researcher Affiliation | Academia | Andjela Mladenovic (Univ. of Montréal & Mila), Iosif Sakos (SUTD), Gauthier Gidel (Univ. of Montréal & Mila), Georgios Piliouras (SUTD)
Pseudocode | No | The paper contains mathematical equations describing dynamics (e.g., (D1), (D2), (D3)) but no explicitly labeled "Algorithm" or "Pseudocode" blocks.
Open Source Code | Yes | "The details of both experiments can be found within the source-code files included with this work." and "We also include the source code files and the necessary input files in the supplementary material that accompanies this work."
Open Datasets | Yes | "train large models on Imagenet and CIFAR (Martens et al., 2021; Arbel et al., 2020)."
Dataset Splits | No | The paper mentions a "synthetic experiment" and refers to Imagenet and CIFAR, but does not provide specific training/validation/test splits, percentages, or absolute sample counts for these datasets.
Hardware Specification | No | The paper does not explicitly describe the hardware used (e.g., GPU/CPU models, memory specifications, or cloud instances) for running its experiments.
Software Dependencies | No | The paper mentions the "scipy package in Python 3" and Optuna, but it does not provide specific version numbers for these software dependencies (e.g., "scipy 1.x.x", "Optuna 2.x.x"), nor for Python itself beyond "Python 3".
Experiment Setup | Yes | "We adapt K-FAC (Martens & Grosse, 2015) as an approximator to compute NHG, with a learning rate of 10⁻⁴, selected using the standard hyperparameter optimizer package Optuna (Akiba et al., 2019). We used gradient clipping to a maximum norm of 1 on both the discriminator and the generator, to stabilize the optimization." and "Regarding the GAN, we consider a Flow-GAN architecture where our generator, G, is a Real NVP consisting of 8 coupling layers (Dinh et al., 2017; 2014), and the discriminator is a 4-layer MLP with 256-128-64-1 output features."
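The gradient clipping to a maximum norm of 1 described in the setup above can be sketched as follows. This is an illustrative helper, not the paper's code; `clip_by_global_norm` is a hypothetical name, and real training code would typically use a framework utility such as PyTorch's `torch.nn.utils.clip_grad_norm_`.

```python
import math

def clip_by_global_norm(grads, max_norm=1.0):
    """Rescale a list of gradient vectors so their combined L2 norm
    does not exceed max_norm (the paper clips both the generator's
    and the discriminator's gradients to a maximum norm of 1)."""
    total = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total <= max_norm:
        return grads  # already within the norm budget, leave unchanged
    scale = max_norm / total
    return [[g * scale for g in vec] for vec in grads]

# Example: a gradient with norm 5 is rescaled so its norm becomes 1.
clipped = clip_by_global_norm([[3.0, 4.0]], max_norm=1.0)
```

Clipping by the global norm (rather than clipping each component independently) preserves the gradient's direction while bounding its magnitude, which is the usual choice for stabilizing GAN optimization.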