Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Understanding Entropic Regularization in GANs

Authors: Daria Reshetova, Yikun Bai, Xiugang Wu, Ayfer Özgür

JMLR 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In our experiments we aim to contrast and compare the performance of Sinkhorn GAN (label: SGAN) and 1-Wasserstein GAN (label: WGAN) for linear generators. Entropic W2GAN is omitted from the comparison because it leads to a biased solution, as shown in Theorem 1. Following the experimental evaluations of Feizi et al. (2017), we generate n = 10^5 samples from a d = 32 dimensional Gaussian distribution N(0, K), where K is a random positive semi-definite matrix normalized to have Frobenius norm 1. We train WGAN with weight clipping (Arjovsky et al., 2017), labeled WGAN-WC, and WGAN with gradient penalty (Gulrajani et al., 2017), labeled WGAN-GP, two common methods to ensure Lipschitzness of the discriminators. We use the linear generator and a neural network discriminator with hyper-parameter settings as recommended by Gulrajani et al. (2017). The discriminator neural network has three hidden layers, each with 64 neurons and ReLU activation functions. The pseudocode of our optimization for Sinkhorn GAN can be found in Algorithm 1. The algorithm is similar to the algorithm of Sanjabi et al. (2018), where we assume that the generators are parametrized by θ, i.e. G(x) = Gθ(x), and we apply stochastic gradient descent on θ. Note that at every step of the gradient descent algorithm, we need to calculate the gradient of the Sinkhorn divergence, ∇θ Sλ(PGθ(X), PY).
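The sampling setup quoted above can be reproduced in a few lines. A minimal NumPy sketch, assuming "random positive semi-definite" means A·Aᵀ for a Gaussian matrix A (the paper does not specify the construction):

```python
import numpy as np

def make_covariance(d, rng):
    # Random PSD matrix: A @ A.T is PSD for any square A.
    # The construction of K is an assumption; the paper only says
    # "random positive semi-definite".
    A = rng.standard_normal((d, d))
    K = A @ A.T
    # Normalize to unit Frobenius norm, as stated in the experiment.
    return K / np.linalg.norm(K, "fro")

rng = np.random.default_rng(0)
d, n = 32, 10**5
K = make_covariance(d, rng)
# n i.i.d. samples from N(0, K), as in the paper's setup.
X = rng.multivariate_normal(np.zeros(d), K, size=n)
```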
Researcher Affiliation | Academia | Daria Reshetova (EMAIL), Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA; Yikun Bai (EMAIL), Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA; Xiugang Wu (EMAIL), Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA; Ayfer Özgür (EMAIL), Department of Electrical Engineering, Stanford University, Stanford, CA 94305, USA
Pseudocode | Yes | Algorithm 1 (SGD for GANs):
INPUT: PX, PŶ, λ, S, θ0, step sizes {αt > 0}, t = 0, ..., T−1
for t = 0, ..., T−1 do
  Sample i.i.d. points x1t, ..., xSt ~ PX and y1t, ..., ySt ~ PŶ
  Call the oracle to find ε-approximate maximizers (φt, ψt) and φxt for the dual formulations (57), (58)
  Compute gt = Σi,j μt(Gθ(xit), yjt) ∇θ(‖Gθ(xit) − yjt‖²) − (1/2) Σi,j μxt(xit, xjt) ∇θ(‖Gθ(xit) − Gθ(xjt)‖²)  (61), where μt and μxt are computed using (φt, ψt) and φxt based on (59), (60)
  Update θt+1 ← θt − αt gt
end for
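The quoted Algorithm 1 can be sketched end-to-end. This is a minimal NumPy illustration under stated assumptions, not the paper's implementation: it replaces the ε-approximate dual oracle with plain Sinkhorn iterations to obtain the plans μt and μxt, assumes uniform empirical measures, and uses a linear generator Gθ(x) = θx:

```python
import numpy as np

def sinkhorn_plan(C, lam, n_iter=200):
    # Entropy-regularized OT plan between two uniform empirical measures
    # (stand-in for the paper's dual oracle; an assumption).
    S = C.shape[0]
    K = np.exp(-C / lam)
    a = np.full(S, 1.0 / S)
    u = np.ones(S)
    for _ in range(n_iter):
        v = a / (K.T @ u)
        u = a / (K @ v)
    return u[:, None] * K * v[None, :]   # transport plan

def sgd_step(theta, x, y, lam, lr):
    # One step of the algorithm for a linear generator G_theta(x) = theta @ x.
    gx = x @ theta.T                                      # generated samples
    C_xy = ((gx[:, None, :] - y[None, :, :]) ** 2).sum(-1)  # cross cost
    mu = sinkhorn_plan(C_xy, lam)
    C_xx = ((gx[:, None, :] - gx[None, :, :]) ** 2).sum(-1) # self cost
    mu_x = sinkhorn_plan(C_xx, lam)
    # Gradient of the Sinkhorn divergence w.r.t. theta:
    # cross term minus half the self term.
    g = np.zeros_like(theta)
    for i in range(x.shape[0]):
        for j in range(y.shape[0]):
            g += mu[i, j] * 2.0 * np.outer(gx[i] - y[j], x[i])
            d = x[i] - x[j]
            g -= 0.5 * mu_x[i, j] * 2.0 * np.outer(theta @ d, d)
    return theta - lr * g
```

In practice one would vectorize the gradient accumulation; the double loop mirrors the summation in the pseudocode for readability.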
Open Source Code | No | The paper does not contain any explicit statement about making its code open-source, nor does it provide a link to a code repository. It only refers to Algorithm 1 as the pseudocode for its optimization.
Open Datasets | No | Following the experimental evaluations of Feizi et al. (2017), we generate n = 10^5 samples from a d = 32 dimensional Gaussian distribution N(0, K), where K is a random positive semi-definite matrix normalized to have Frobenius norm 1.
Dataset Splits | No | The paper mentions generating n = 10^5 samples but does not specify any training, validation, or test splits for these samples.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud computing instances used for the experiments.
Software Dependencies | No | The paper mentions using a 'neural network discriminator' and 'ReLU activation functions' but does not specify any software libraries (e.g., PyTorch, TensorFlow) or their version numbers.
Experiment Setup | Yes | We train WGAN with weight clipping (Arjovsky et al., 2017), labeled WGAN-WC, and WGAN with gradient penalty (Gulrajani et al., 2017), labeled WGAN-GP, two common methods to ensure Lipschitzness of the discriminators. We use the linear generator and a neural network discriminator with hyper-parameter settings as recommended by Gulrajani et al. (2017). The discriminator neural network has three hidden layers, each with 64 neurons and ReLU activation functions. The pseudocode of our optimization for Sinkhorn GAN can be found in Algorithm 1. The algorithm is similar to the algorithm of Sanjabi et al. (2018), where we assume that the generators are parametrized by θ, i.e. G(x) = Gθ(x), and we apply stochastic gradient descent on θ. We run the experiments for 500 epochs with a batch size of 200.
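For illustration, the stated discriminator architecture (three hidden layers of 64 ReLU units, scalar critic output) can be sketched in NumPy. The initialization scheme below is an assumption; the actual experiments follow the hyper-parameter settings of Gulrajani et al. (2017):

```python
import numpy as np

def init_discriminator(d_in, hidden=64, rng=None):
    # Three hidden layers of `hidden` units plus a scalar output layer,
    # matching the architecture stated in the setup. He-style initialization
    # is an assumption, not specified by the paper.
    rng = rng if rng is not None else np.random.default_rng(0)
    sizes = [d_in, hidden, hidden, hidden, 1]
    return [(rng.standard_normal((m, n)) * np.sqrt(2.0 / m), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def discriminator(params, x):
    # Forward pass: x has shape (batch, d_in); returns critic scores (batch,).
    h = x
    for W, b in params[:-1]:
        h = np.maximum(h @ W + b, 0.0)   # ReLU hidden layers
    W, b = params[-1]
    return (h @ W + b).ravel()           # linear critic output
```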