Sobolev GAN
Authors: Youssef Mroueh, Chun-Liang Li, Tom Sercu, Anant Raj, Yu Cheng
ICLR 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically study Sobolev GAN in character level text generation (Section 6.1). We finally show in Section 6.2 that a variant of Sobolev GAN achieves competitive semi-supervised learning results on CIFAR-10, thanks to the smoothness enforced on the critic by Sobolev GAN which relates to Laplacian regularization. |
| Researcher Affiliation | Collaboration | Youssef Mroueh, Chun-Liang Li, Tom Sercu, Anant Raj & Yu Cheng. IBM Research AI; Carnegie Mellon University; Max Planck Institute for Intelligent Systems. Equal contribution is marked in the paper. {mroueh,chengyu}@us.ibm.com, chunlial@cs.cmu.edu, tom.sercu1@ibm.com, anant.raj@tuebingen.mpg.de |
| Pseudocode | Yes | Algorithm 1 Sobolev GAN |
| Open Source Code | Yes | Code for semi-supervised learning experiments is available on https://github.com/tomsercu/SobolevGAN-SSL |
| Open Datasets | Yes | semi-supervised learning on CIFAR-10 |
| Dataset Splits | No | We do hyperparameter and model selection on the validation set. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running experiments were mentioned. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., library names with explicit version tags) were mentioned. |
| Experiment Setup | Yes | We use Adam with learning rate η = 2e-4, β1 = 0.5 and β2 = 0.999, both for critic f (without BN) and Generator (with BN). We train all models for 350 epochs. We used some L2 weight decay: 1e-6 on ω, S (i.e. all layers except last) and 1e-3 weight decay on the last layer v. For formulation 1 (Fisher only) we have ρF = 1e-7, modified critic learning rate ηD = 1e-4, critic iters nc = 2. For formulation 2 (Sobolev + Fisher) we have ρF = 5e-8, ρS = 2e-8, critic iters nc = 1. For the WGAN-GP (Gulrajani et al., 2017) baseline SSL experiment we followed the original paper with critic iters nc = 5, ηG = ηD = 1e-4, Adam β2 = 0.9 and GP weight λGP = 10.0. The noise level σ was annealed following a linear schedule starting from an initial noise level σ0 (at iteration i, σi = σ0(1 - i/Maxiter), Maxiter = 30K). |
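
The experiment setup row can be written down as a small configuration plus the linear noise schedule. The sketch below is only an illustration of those reported values, not the authors' code: the dictionary keys, the function name `annealed_noise`, and the choice to clamp the noise at zero after `Maxiter` are assumptions, and the initial noise level σ0 is not reported in the row, so the value used in the usage lines is a placeholder.

```python
# Hyperparameters as reported in the Experiment Setup row.
# Key names are hypothetical; only the numeric values come from the source.
ADAM = {"lr": 2e-4, "beta1": 0.5, "beta2": 0.999}
WEIGHT_DECAY = {"all_but_last_layer": 1e-6, "last_layer": 1e-3}
FORMULATIONS = {
    "fisher_only": {"rho_F": 1e-7, "critic_lr": 1e-4, "critic_iters": 2},
    "sobolev_fisher": {"rho_F": 5e-8, "rho_S": 2e-8, "critic_iters": 1},
}
MAX_ITER = 30_000  # Maxiter = 30K


def annealed_noise(i, sigma0, max_iter=MAX_ITER):
    """Linear schedule: sigma_i = sigma0 * (1 - i / max_iter)."""
    # Assumption: iterations outside [0, max_iter] are clamped, so the
    # noise level never becomes negative or exceeds sigma0.
    frac = min(max(i / max_iter, 0.0), 1.0)
    return sigma0 * (1.0 - frac)


if __name__ == "__main__":
    sigma0 = 1.0  # placeholder; the source does not state sigma0
    for step in (0, 15_000, 30_000):
        print(step, annealed_noise(step, sigma0))
```

With this schedule the noise is at its initial level at iteration 0, at half that level at iteration 15K, and at zero from iteration 30K onward.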