Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Clipped Stochastic Methods for Variational Inequalities with Heavy-Tailed Noise
Authors: Eduard Gorbunov, Marina Danilova, David Dobre, Pavel Dvurechenskii, Alexander Gasnikov, Gauthier Gidel
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our theoretical results, we conduct experiments on heavy-tailed min-max problems to demonstrate the importance of clipping when using non-adaptive methods such as SGDA or SEG. We train a Wasserstein GAN with gradient penalty [Gulrajani et al., 2017] on CIFAR-10 [Krizhevsky et al., 2009] using SGDA, clipped-SGDA, and clipped-SEG, and show the evolution of the gradient noise histograms during training. |
| Researcher Affiliation | Academia | Eduard Gorbunov (MIPT, Russia; Mila & UdeM, Canada; MBZUAI, UAE); Marina Danilova (MIPT, Russia); David Dobre (Mila & UdeM, Canada); Pavel Dvurechensky (WIAS, Germany); Alexander Gasnikov (MIPT, Russia; HSE University, Russia; IITP RAS, Russia); Gauthier Gidel (Mila & UdeM, Canada; Canada CIFAR AI Chair) |
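| Pseudocode | Yes | $x^{k+1} = x^k - \gamma_2 \widetilde{F}_{\xi_2^k}(\widetilde{x}^k)$, where $\widetilde{x}^k = x^k - \gamma_1 \widetilde{F}_{\xi_1^k}(x^k)$, (clipped-SEG) $\widetilde{F}_{\xi_1^k}(x^k) = \mathrm{clip}\left(\frac{1}{m_{1,k}} \sum_{i=1}^{m_{1,k}} F_{\xi_1^{i,k}}(x^k), \lambda_{1,k}\right)$, $\widetilde{F}_{\xi_2^k}(\widetilde{x}^k) = \mathrm{clip}\left(\frac{1}{m_{2,k}} \sum_{i=1}^{m_{2,k}} F_{\xi_2^{i,k}}(\widetilde{x}^k), \lambda_{2,k}\right)$, where $\{\xi_1^{i,k}\}_{i=1}^{m_{1,k}}$, $\{\xi_2^{i,k}\}_{i=1}^{m_{2,k}}$ are independent samples from the distribution $\mathcal{D}$. |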
| Open Source Code | Yes | Our codes are publicly available: https://github.com/busycalibrating/clipped-stochastic-methods. |
| Open Datasets | Yes | We train a Wasserstein GAN with gradient penalty [Gulrajani et al., 2017] on CIFAR-10 [Krizhevsky et al., 2009]... We train on FFHQ downsampled to 128 × 128 pixels [Karras et al., 2019]. |
| Dataset Splits | No | The paper mentions training models on CIFAR-10 and FFHQ and conducting hyperparameter sweeps, but it does not explicitly provide percentages or absolute counts for training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU specifications, or memory used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers (e.g., 'PyTorch 1.9'). It mentions adapting code from a 'publicly available WGAN-GP implementation' and 'pytorch-gan-collections' but without specific versioning. |
| Experiment Setup | Yes | We use the default architectures and training parameters specified in Gulrajani et al. [2017] ($\lambda_{\mathrm{GP}} = 10$, $n_{\mathrm{dis}} = 5$, learning rate decayed linearly to 0 over 100k steps)... We train on FFHQ downsampled to 128 × 128 pixels, and use the recommended StyleGAN2 hyperparameter configuration for this resolution (batch size = 32, γ = 0.1024, map depth = 2, channel multiplier = 16384). |
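
The clipped-SEG update quoted in the Pseudocode row can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation (their code is in the repository linked in the table). It assumes that clip rescales a vector so its Euclidean norm is at most λ and that the mini-batch operator estimate is an average over the samples. The names `clip`, `clipped_seg_step`, and `operator_sample`, the constant step sizes and clipping levels, and the toy bilinear problem with Student-t noise are all hypothetical choices made for illustration.

```python
import numpy as np


def clip(v, lam):
    """Norm clipping (assumed form): return min(1, lam / ||v||) * v."""
    norm = np.linalg.norm(v)
    if norm <= lam:
        return v
    return (lam / norm) * v


def clipped_seg_step(x, operator_sample, rng, gamma1, gamma2, lam1, lam2, m1=1, m2=1):
    """One clipped-SEG step with constant step sizes and clipping levels.

    operator_sample(x, rng) is a hypothetical callable returning one
    stochastic estimate of the operator F at x.
    """
    # Extrapolation: average m1 samples at x, clip, and take a step.
    f1 = np.mean([operator_sample(x, rng) for _ in range(m1)], axis=0)
    x_tilde = x - gamma1 * clip(f1, lam1)
    # Update: average m2 fresh samples at the extrapolated point, clip, step from x.
    f2 = np.mean([operator_sample(x_tilde, rng) for _ in range(m2)], axis=0)
    return x - gamma2 * clip(f2, lam2)


if __name__ == "__main__":
    # Toy problem (an assumption for illustration): bilinear game min_u max_v u*v,
    # whose operator is F(u, v) = (v, -u), perturbed by heavy-tailed Student-t noise.
    def noisy_bilinear(z, rng):
        return np.array([z[1], -z[0]]) + rng.standard_t(df=2, size=2)

    rng = np.random.default_rng(0)
    z = np.array([1.0, 1.0])
    for _ in range(1000):
        z = clipped_seg_step(z, noisy_bilinear, rng,
                             gamma1=0.05, gamma2=0.05, lam1=1.0, lam2=1.0, m1=4, m2=4)
    print("final iterate:", z)
```

Setting `lam1` and `lam2` to very large values makes the clipping inactive, which reduces the step to plain SEG under the same assumptions.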