Sobolev GAN

Authors: Youssef Mroueh, Chun-Liang Li, Tom Sercu, Anant Raj, Yu Cheng

ICLR 2018

Reproducibility Variable Result LLM Response
Research Type Experimental We empirically study Sobolev GAN in character-level text generation (Section 6.1). We finally show in Section 6.2 that a variant of Sobolev GAN achieves competitive semi-supervised learning results on CIFAR-10, thanks to the smoothness enforced on the critic by Sobolev GAN, which relates to Laplacian regularization.
Researcher Affiliation Collaboration Youssef Mroueh, Chun-Liang Li, Tom Sercu, Anant Raj & Yu Cheng. Affiliations: IBM Research AI; Carnegie Mellon University; Max Planck Institute for Intelligent Systems. The paper marks a subset of the authors as equal contributors. Contact: {mroueh,chengyu}@us.ibm.com, chunlial@cs.cmu.edu, tom.sercu1@ibm.com, anant.raj@tuebingen.mpg.de
Pseudocode Yes Algorithm 1 Sobolev GAN
Open Source Code Yes Code for semi-supervised learning experiments is available on https://github.com/tomsercu/SobolevGAN-SSL
Open Datasets Yes semi-supervised learning on CIFAR-10
Dataset Splits No We do hyperparameter and model selection on the validation set.
Hardware Specification No No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned.
Software Dependencies No No specific software dependencies with version numbers (e.g., library names with explicit version tags) were mentioned.
Experiment Setup Yes We use Adam with learning rate η = 2e-4, β1 = 0.5 and β2 = 0.999, both for critic f (without BN) and Generator (with BN). We train all models for 350 epochs. We used some L2 weight decay: 1e-6 on ω, S (i.e. all layers except last) and 1e-3 weight decay on the last layer v. For formulation 1 (Fisher only) we have ρF = 1e-7, modified critic learning rate ηD = 1e-4, critic iters nc = 2. For formulation 2 (Sobolev + Fisher) we have ρF = 5e-8, ρS = 2e-8, critic iters nc = 1. For the WGAN-GP (Gulrajani et al., 2017) baseline SSL experiment we followed the original paper with critic iters nc = 5, ηG = ηD = 1e-4, Adam β2 = 0.9 and GP weight λGP = 10.0. The noise level σ was annealed following a linear schedule starting from an initial noise level σ0 (at iteration i, σ_i = σ_0 (1 - i/Maxiter), Maxiter = 30K).
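The setup above can be summarized as a small configuration sketch. This is an illustration only, not the authors' code. It assumes the exponents in the quoted values are negative (for example, a learning rate of 2e-4). The names `AdamConfig`, `FORMULATIONS`, and `noise_level` are hypothetical. The initial noise level σ0 is not given in the text, so it is left as a required argument. Clamping the noise at zero past Maxiter is also an assumption.

```python
from dataclasses import dataclass

MAX_ITER = 30_000  # Maxiter = 30K from the quoted setup


@dataclass(frozen=True)
class AdamConfig:
    """Adam hyperparameters as quoted in the setup (hypothetical container)."""
    lr: float
    beta1: float
    beta2: float


# Default optimizer settings for both the critic and the generator.
DEFAULT_ADAM = AdamConfig(lr=2e-4, beta1=0.5, beta2=0.999)

# L2 weight decay: all layers except the last, and the last layer v.
WEIGHT_DECAY = {"all_but_last": 1e-6, "last_layer": 1e-3}

# Per-experiment values; only the values stated in the text are listed.
FORMULATIONS = {
    "fisher_only": {"rho_F": 1e-7, "critic_lr": 1e-4, "critic_iters": 2},
    "sobolev_plus_fisher": {"rho_F": 5e-8, "rho_S": 2e-8, "critic_iters": 1},
    "wgan_gp_baseline": {
        "critic_iters": 5,
        "generator_lr": 1e-4,
        "critic_lr": 1e-4,
        "adam_beta2": 0.9,
        "gp_weight": 10.0,
    },
}


def noise_level(i: int, sigma0: float, max_iter: int = MAX_ITER) -> float:
    """Linear annealing: sigma_i = sigma0 * (1 - i / max_iter).

    The value is clamped at zero for i > max_iter (an assumption; the
    source does not say what happens past Maxiter).
    """
    return sigma0 * max(0.0, 1.0 - i / max_iter)
```

With a placeholder `sigma0 = 1.0` (not a value from the paper), `noise_level(15_000, 1.0)` gives half the initial noise, and the schedule reaches zero at iteration 30,000.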