Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise

Authors: Eduard Gorbunov, Marina Danilova, David Dobre, Pavel Dvurechensky, Alexander Gasnikov, Gauthier Gidel

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To validate our theoretical results, we conduct experiments on heavy-tailed min-max problems to demonstrate the importance of clipping when using non-adaptive methods such as SGDA or SEG. We train a Wasserstein GAN with gradient penalty [Gulrajani et al., 2017] on CIFAR-10 [Krizhevsky et al., 2009] using SGDA, clipped-SGDA, and clipped-SEG, and show the evolution of the gradient noise histograms during training. (A sketch of one way such gradient-noise histograms might be collected appears after the table.)
Researcher Affiliation | Academia | Eduard Gorbunov: MIPT, Russia; Mila & UdeM, Canada; MBZUAI, UAE. Marina Danilova: MIPT, Russia. David Dobre: Mila & UdeM, Canada. Pavel Dvurechensky: WIAS, Germany. Alexander Gasnikov: MIPT, Russia; HSE University, Russia; IITP RAS, Russia. Gauthier Gidel: Mila & UdeM, Canada; Canada CIFAR AI Chair.
Pseudocode | Yes | $x^{k+1} = x^k - \gamma_2 \widetilde{F}_{\xi_2^k}(\widetilde{x}^k)$, where $\widetilde{x}^k = x^k - \gamma_1 \widetilde{F}_{\xi_1^k}(x^k)$ (clipped-SEG), with $\widetilde{F}_{\xi_1^k}(x^k) = \operatorname{clip}\!\left(\frac{1}{m_{1,k}}\sum_{i=1}^{m_{1,k}} F_{\xi_1^{i,k}}(x^k),\ \lambda_{1,k}\right)$ and $\widetilde{F}_{\xi_2^k}(\widetilde{x}^k) = \operatorname{clip}\!\left(\frac{1}{m_{2,k}}\sum_{i=1}^{m_{2,k}} F_{\xi_2^{i,k}}(\widetilde{x}^k),\ \lambda_{2,k}\right)$, where $\{\xi_1^{i,k}\}_{i=1}^{m_{1,k}}$ and $\{\xi_2^{i,k}\}_{i=1}^{m_{2,k}}$ are independent samples from the distribution $\mathcal{D}$. (A minimal implementation sketch appears after the table.)
Open Source Code | Yes | Our codes are publicly available: https://github.com/busycalibrating/clipped-stochastic-methods.
Open Datasets | Yes | We train a Wasserstein GAN with gradient penalty [Gulrajani et al., 2017] on CIFAR-10 [Krizhevsky et al., 2009]... We train on FFHQ downsampled to 128×128 pixels [Karras et al., 2019].
Dataset Splits | No | The paper mentions training models on CIFAR-10 and FFHQ and conducting hyperparameter sweeps, but it does not explicitly provide percentages or absolute counts for training, validation, or test dataset splits.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU specifications, or memory used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers (e.g., 'PyTorch 1.9'). It mentions adapting code from a 'publicly available WGAN-GP implementation' and 'pytorch-gan-collections' but without specific versioning.
Experiment Setup | Yes | We use the default architectures and training parameters specified in Gulrajani et al. [2017] ($\lambda_{GP} = 10$, $n_{dis} = 5$, learning rate decayed linearly to 0 over 100k steps)... We train on FFHQ downsampled to 128×128 pixels, and use the recommended StyleGAN2 hyperparameter configuration for this resolution (batch size = 32, γ = 0.1024, map depth = 2, channel multiplier = 16384). (These settings are restated as a config sketch after the table.)
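
The update rule quoted in the Pseudocode row can be made concrete with a short sketch. This is a minimal illustration and not the authors' implementation: the bilinear min-max instance, the Student-t noise model, the step sizes, clipping levels, and batch sizes are all assumptions chosen for readability.

```python
# Hedged sketch of the clipped-SEG update quoted in the Pseudocode row.
# Problem instance, noise model, and all hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
dim = 5
A = rng.standard_normal((dim, dim))


def clip(v, lam):
    """clip(v, lam) = min(1, lam / ||v||) * v  (identity when ||v|| <= lam)."""
    norm = np.linalg.norm(v)
    return v if norm <= lam else (lam / norm) * v


def batch_operator(z, batch_size):
    """Mini-batch estimate of F(z) for min_x max_y x^T A y, i.e.
    F(x, y) = (A y, -A^T x), perturbed by heavy-tailed Student-t noise
    (the noise model is an assumption made for illustration)."""
    x, y = z[:dim], z[dim:]
    exact = np.concatenate([A @ y, -A.T @ x])
    noise = rng.standard_t(df=2.5, size=(batch_size, exact.size)).mean(axis=0)
    return exact + noise


z = rng.standard_normal(2 * dim)   # z^k = (x^k, y^k)
gamma1 = gamma2 = 0.05             # step sizes gamma_1, gamma_2
lam1 = lam2 = 1.0                  # clipping levels lambda_{1,k}, lambda_{2,k}
m1 = m2 = 16                       # batch sizes m_{1,k}, m_{2,k}

for k in range(2000):
    g1 = clip(batch_operator(z, m1), lam1)        # clipped estimate at z^k
    z_half = z - gamma1 * g1                      # extrapolation point
    g2 = clip(batch_operator(z_half, m2), lam2)   # clipped estimate at the extrapolation point
    z = z - gamma2 * g2                           # clipped-SEG step

residual = np.concatenate([A @ z[dim:], -A.T @ z[:dim]])
print("||F(z)|| after clipped-SEG:", np.linalg.norm(residual))
```

Clipped-SGDA, also referenced in the Research Type row, differs only in that the clipped update is applied directly at $x^k$, without the extrapolation step.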
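The Research Type row mentions tracking the evolution of gradient-noise histograms during WGAN-GP training. Below is a hedged sketch of one way such statistics could be gathered; the helper name `gradient_noise_norms` and the use of a full-batch gradient as the noise-free reference are assumptions for illustration, not the procedure used in the paper's code.

```python
# Hedged sketch: collect gradient-noise norms for a histogram.
# The reference gradient and helper name are illustrative assumptions.
import torch


def gradient_noise_norms(model, loss_fn, data, targets, batch_size=64, n_batches=100):
    """Return ||g_batch - g_ref|| for n_batches random mini-batches, where
    g_ref is the gradient over all provided samples (a stand-in for the
    noise-free gradient)."""
    params = [p for p in model.parameters() if p.requires_grad]

    def flat_grad(x, y):
        model.zero_grad()
        loss_fn(model(x), y).backward()
        return torch.cat([p.grad.detach().reshape(-1) for p in params])

    g_ref = flat_grad(data, targets)
    norms = []
    for _ in range(n_batches):
        idx = torch.randint(0, data.shape[0], (batch_size,))
        norms.append((flat_grad(data[idx], targets[idx]) - g_ref).norm().item())
    return norms  # e.g. matplotlib.pyplot.hist(norms) at several checkpoints
```

For a GAN, one would apply this separately to the generator and discriminator parameters at several points during training and compare how the histograms evolve.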
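For convenience, the hyperparameters quoted in the Experiment Setup row can be gathered into a configuration sketch; the key names are illustrative assumptions and only the values come from the quoted text.

```python
# Hedged restatement of the quoted hyperparameters; key names are illustrative.
wgan_gp_cifar10 = {
    "lambda_gp": 10,                 # gradient-penalty coefficient (Gulrajani et al., 2017)
    "n_dis": 5,                      # discriminator steps per generator step
    "lr_schedule": "linear decay to 0 over 100k steps",
}

stylegan2_ffhq_128 = {
    "resolution": 128,               # FFHQ downsampled to 128x128
    "batch_size": 32,
    "gamma": 0.1024,
    "map_depth": 2,
    "channel_multiplier": 16384,
}
```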