Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution

Authors: Elen Vardanyan, Sona Hunanyan, Tigran Galstyan, Arshak Minasyan, Arnak S. Dalalyan

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our implementation follows the pseudo-code presented in Algorithm 1 and is inspired by the code accompanying (Gulrajani et al., 2017), where WGANs with a gradient penalty on the discriminator/critic network are discussed. We add the LIPERM penalization to the objective function of the WGAN. All the functional classes F0, G0, and H0 are represented by neural networks, whose architectures are presented in the supplementary material. In all our experiments, we chose n_critic = 5 and γ = 1. We conducted experiments on three widely used datasets: Swiss Roll, MNIST, and CIFAR-10. The results are briefly summarized in this section.
Researcher Affiliation | Collaboration | (1) Department of Mathematics, Yerevan State University (YSU), Armenia; (2) YerevaNN, Armenia; (3) CREST, GENES, Institut Polytechnique de Paris, France. Correspondence to: Elen Vardanyan <evardanyan@aua.am>, Arnak Dalalyan <arnak.dalalyan@ensae.fr>.
Pseudocode | Yes | Algorithm 1 WGAN-LIPERM (we take λ ∈ {0, 1, 4, 8}).
Require: LIP coefficient λ, gradient penalty coefficient γ, number of iterations N_iter, number of critic iterations per generator iteration n_critic, batch size m
Require: initial critic and generator parameters (w_0, θ_0), initial left-inverse network parameters φ_0, k = 0
1: repeat
2:   k ← k + 1
3:   for t = 1, ..., n_critic do
4:     for i = 1, ..., m do
5:       Draw x ∼ P_{n,X} (true examples)
6:       Draw u ∼ U_d (latent variables)
7:       Draw ε ∼ U[0, 1]
8:       x̃ ← G_θ(u) (generated examples)
9:       x̂ ← εx + (1 − ε)x̃
10:      L_d ← F_w(x̃) − F_w(x)
11:      L_d^(i) ← L_d + γ(‖∇_x̂ F_w(x̂)‖_2 − 1)²
12:    end for
13:    w ← Adam(∇_w (1/m) Σ_{i=1}^m L_d^(i), {w})
14:  end for
15:  for i = 1, ..., m do
16:    Draw u ∼ U_d
17:    L_g^(i) ← −F_w(G_θ(u)) + λ‖H_φ(G_θ(u)) − u‖_2
18:  end for
19:  (θ, φ) ← Adam(∇_{θ,φ} (1/m) Σ_{i=1}^m L_g^(i), {(θ, φ)})
20: until k > N_iter
(A hedged PyTorch sketch of this loop appears after the table.)
Open Source Code | Yes | Our code uses the framework of (Varuna Jayasiri, 2020) and is available here.
Open Datasets | Yes | We conducted experiments on three widely used datasets: Swiss Roll, MNIST, and CIFAR-10.
Dataset Splits | No | The paper mentions training but does not explicitly provide details about training/validation/test splits, such as percentages or sample counts for each split.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU or GPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions using a framework and provides code, but does not list specific software dependencies with their version numbers.
Experiment Setup | Yes | Our implementation follows the pseudo-code presented in Algorithm 1 and is inspired by the code accompanying (Gulrajani et al., 2017), where WGANs with a gradient penalty on the discriminator/critic network are discussed. We add the LIPERM penalization to the objective function of the WGAN. All the functional classes F0, G0, and H0 are represented by neural networks, whose architectures are presented in the supplementary material. In all our experiments, we chose n_critic = 5 and γ = 1. (These choices, together with the λ grid from Algorithm 1, appear in the sweep sketch below.)
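
To make the reconstructed Algorithm 1 concrete, here is a minimal PyTorch sketch of one WGAN-LIPERM training iteration. It assumes `critic` (F_w), `generator` (G_θ), and `left_inverse` (H_φ) are `nn.Module` instances (the paper's actual architectures are in its supplementary material), that `opt_c` and `opt_g` are Adam optimizers over the critic and the joint (generator, left-inverse) parameters, and that `sample_real` returns a batch from the empirical distribution. All names are illustrative, not taken from the authors' code; the latent law U_d is assumed uniform on [0, 1]^d.

```python
# Minimal sketch of one WGAN-LIPERM iteration (Algorithm 1), assuming PyTorch.
# All module and helper names are illustrative, not from the authors' code.
import torch

def gradient_penalty(critic, real, fake, gamma):
    """Gulrajani-style penalty: gamma * (||grad_xhat F_w(x_hat)||_2 - 1)^2."""
    # One interpolation coefficient per sample, broadcast over feature dims.
    eps = torch.rand(real.size(0), *[1] * (real.dim() - 1), device=real.device)
    x_hat = (eps * real + (1 - eps) * fake).requires_grad_(True)
    grad, = torch.autograd.grad(critic(x_hat).sum(), x_hat, create_graph=True)
    return gamma * (grad.flatten(1).norm(2, dim=1) - 1).pow(2).mean()

def wgan_liperm_step(critic, generator, left_inverse, opt_c, opt_g,
                     sample_real, d, m, n_critic=5, gamma=1.0, lam=1.0):
    # Critic updates (lines 3-14 of Algorithm 1).
    for _ in range(n_critic):
        x = sample_real(m)                 # x ~ P_{n,X} (true examples)
        u = torch.rand(m, d)               # u ~ U_d (latent law assumed uniform)
        x_fake = generator(u).detach()     # freeze generator during critic step
        loss_d = (critic(x_fake) - critic(x)).mean() \
                 + gradient_penalty(critic, x, x_fake, gamma)
        opt_c.zero_grad(); loss_d.backward(); opt_c.step()

    # Joint generator / left-inverse update (lines 15-19); the minus sign on
    # F_w is the standard WGAN generator objective.
    u = torch.rand(m, d)
    x_fake = generator(u)
    liperm = (left_inverse(x_fake) - u).norm(2, dim=1)  # ||H_phi(G_theta(u)) - u||_2
    loss_g = -critic(x_fake).mean() + lam * liperm.mean()
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Detaching `x_fake` during the critic phase mirrors the pseudo-code's separation of the two updates: only `w` moves in lines 3-14, while θ and φ move jointly in line 19.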
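As a usage note, the reported grid (n_critic = 5, γ = 1, λ ∈ {0, 1, 4, 8}) could be swept as sketched below; `build_models`, `n_iter`, `latent_dim`, and `batch_size` are hypothetical placeholders, not values or helpers from the paper.

```python
# Hedged sketch of the reported hyperparameter grid.
for lam in (0, 1, 4, 8):                 # LIPERM coefficients from the paper
    critic, generator, left_inverse, opt_c, opt_g = build_models()
    for _ in range(n_iter):              # N_iter generator iterations
        wgan_liperm_step(critic, generator, left_inverse, opt_c, opt_g,
                         sample_real, d=latent_dim, m=batch_size,
                         n_critic=5, gamma=1.0, lam=lam)
```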