Statistically Optimal Generative Modeling with Maximum Deviation from the Empirical Distribution
Authors: Elen Vardanyan, Sona Hunanyan, Tigran Galstyan, Arshak Minasyan, Arnak S. Dalalyan
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our implementation follows the pseudo-code presented in Algorithm 1, and is inspired by the code accompanying (Gulrajani et al., 2017), where WGANs with a gradient penalty on the discriminator/critic network are discussed. We add the LIPERM penalization to the objective function of the WGANs. All the functional classes F_0, G_0, and H_0 are represented by neural networks, the architectures of which are presented in the supplementary material. In all our experiments, we chose n_critic = 5 and γ = 1. We conducted experiments on three widely used datasets: Swiss Roll, MNIST and CIFAR 10. The results are briefly summarized in this section. |
| Researcher Affiliation | Collaboration | (1) Department of Mathematics, Yerevan State University (YSU), Armenia; (2) YerevaNN, Armenia; (3) CREST, GENES, Institut Polytechnique de Paris, France. Correspondence to: Elen Vardanyan <evardanyan@aua.am>, Arnak Dalalyan <arnak.dalalyan@ensae.fr>. |
| Pseudocode | Yes | Algorithm 1 (WGAN-LIPERM); the experiments take λ ∈ {0, 1, 4, 8}. Require: LIPERM coefficient λ, gradient penalty coefficient γ, number of iterations N_iter, number of critic iterations per generator iteration n_critic, batch size m; initial critic and generator parameters (w_0, θ_0), initial left-inverse network parameters φ_0, k = 0. 1: repeat 2: k ← k + 1 3: for t = 1, ..., n_critic do 4: for i = 1, ..., m do 5: Draw x ~ P_{n,X} (true example) 6: Draw u ~ U_d (latent variable) 7: Draw ε ~ U[0, 1] 8: x̃ ← G_θ(u) (generated example) 9: x̂ ← εx + (1 − ε)x̃ 10: L_d ← F_w(x̃) − F_w(x) 11: L_d^(i) ← L_d + γ(‖∇_x̂ F_w(x̂)‖_2 − 1)² 12: end for 13: w ← Adam(∇_w (1/m) Σ_{i=1}^m L_d^(i), {w}) 14: end for 15: for i = 1, ..., m do 16: Draw u ~ U_d 17: L_g^(i) ← −F_w(G_θ(u)) + λ‖H_φ(G_θ(u)) − u‖² 18: end for 19: (θ, φ) ← Adam(∇_{θ,φ} (1/m) Σ_{i=1}^m L_g^(i), {(θ, φ)}) 20: until k > N_iter. A minimal sketch of this training loop is given after the table. |
| Open Source Code | Yes | Our code uses the framework of (Varuna Jayasiri, 2020) and is available here. |
| Open Datasets | Yes | We conducted experiments on three widely used datasets: Swiss Roll, MNIST and CIFAR 10. |
| Dataset Splits | No | The paper mentions training, but does not explicitly provide details about training/validation/test splits, such as percentages or sample counts for each split. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions using a framework and provides code, but does not list specific software dependencies with their version numbers. |
| Experiment Setup | Yes | Our implementation follows the pseudo-code presented in Algorithm 1, and is inspired by the code accompanying (Gulrajani et al., 2017), where WGANs with a gradient penalty on the discriminator/critic network are discussed. We add the LIPERM penalization to the objective function of the WGANs. All the functional classes F_0, G_0, and H_0 are represented by neural networks, the architectures of which are presented in the supplementary material. In all our experiments, we chose n_critic = 5 and γ = 1. |
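
To make the structure of Algorithm 1 concrete, here is a minimal PyTorch sketch of the WGAN-LIPERM training loop: a critic update with a gradient penalty, followed by a joint update of the generator and the left-inverse network with the LIPERM penalty λ‖H_φ(G_θ(u)) − u‖². The MLP architectures, latent distribution (uniform on [0,1]^d), learning rates, and the `sample_real` helper are illustrative assumptions, not the authors' released code; only n_critic = 5 and γ = 1 follow the paper.

```python
# Minimal sketch of the WGAN-LIPERM training loop (Algorithm 1), under the
# assumptions stated above. Not the authors' implementation.
import torch
import torch.nn as nn

d, lam, gamma, n_critic, m = 64, 1.0, 1.0, 5, 128   # latent dim, λ, γ, critic iters, batch size
data_dim = 784                                        # e.g. flattened MNIST (an assumption)

# F (critic), G (generator), H (left inverse of G) as placeholder MLPs standing in
# for the architectures described in the paper's supplementary material.
F = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, 1))
G = nn.Sequential(nn.Linear(d, 256), nn.ReLU(), nn.Linear(256, data_dim))
H = nn.Sequential(nn.Linear(data_dim, 256), nn.ReLU(), nn.Linear(256, d))

opt_F = torch.optim.Adam(F.parameters(), lr=1e-4, betas=(0.5, 0.9))
opt_GH = torch.optim.Adam(list(G.parameters()) + list(H.parameters()),
                          lr=1e-4, betas=(0.5, 0.9))

def sample_real(m):
    # Placeholder for drawing a batch from P_{n,X}; replace with a real data loader.
    return torch.randn(m, data_dim)

def gradient_penalty(x_real, x_fake):
    # (‖∇_x̂ F(x̂)‖₂ − 1)² evaluated at x̂ = εx + (1 − ε)x̃, as in WGAN-GP.
    eps = torch.rand(x_real.size(0), 1)
    x_hat = (eps * x_real + (1 - eps) * x_fake).requires_grad_(True)
    grad = torch.autograd.grad(F(x_hat).sum(), x_hat, create_graph=True)[0]
    return ((grad.norm(2, dim=1) - 1) ** 2).mean()

for k in range(10):                       # N_iter; kept small for illustration
    for _ in range(n_critic):
        x = sample_real(m)
        u = torch.rand(m, d)              # latent u ~ U_d, assumed uniform on [0,1]^d
        x_fake = G(u).detach()
        # Critic loss: F(x̃) − F(x) + γ · gradient penalty (lines 10-13).
        loss_F = F(x_fake).mean() - F(x).mean() + gamma * gradient_penalty(x, x_fake)
        opt_F.zero_grad(); loss_F.backward(); opt_F.step()

    u = torch.rand(m, d)
    x_fake = G(u)
    # Generator / left-inverse loss: −F(G(u)) + λ‖H(G(u)) − u‖² (lines 15-19).
    liperm = ((H(x_fake) - u) ** 2).sum(dim=1).mean()
    loss_GH = -F(x_fake).mean() + lam * liperm
    opt_GH.zero_grad(); loss_GH.backward(); opt_GH.step()
```

Setting λ = 0 in this sketch recovers a plain WGAN-GP update, which matches how the paper varies λ over {0, 1, 4, 8} to isolate the effect of the LIPERM penalization.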