K-Beam Minimax: Efficient Optimization for Deep Adversarial Learning

Authors: Jihun Hamm, Yung-Kyun Noh

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | To demonstrate the advantages of the algorithm, we test the algorithm on the toy surfaces (Fig. 1) for which we know the true minimax solutions. For real-world demonstrations, we also test the algorithm on GAN problems (Goodfellow et al., 2014) and unsupervised domain-adaptation problems (Ganin & Lempitsky, 2015). Examples were chosen so that the performance can be measured objectively, by the Jensen-Shannon divergence for GAN and by cross-domain classification error for domain adaptation. Evaluations show that the proposed K-beam subgradient-descent approach can significantly improve stability and convergence speed of minimax optimization. (A histogram-based Jensen-Shannon divergence sketch appears after the table.) |
| Researcher Affiliation | Academia | ¹The Ohio State University, Columbus, OH, USA. ²Seoul National University, Seoul, Korea. |
| Pseudocode | Yes | Algorithm 1: K-beam ϵ-subgradient descent (a sketch of the K-beam update appears after the table). |
| Open Source Code | Yes | The codes for the project can be found at https://github.com/jihunhamm/k-beam-minimax. |
| Open Datasets | Yes | We train GANs with the proposed algorithm to learn a generative model of two-dimensional mixtures of Gaussians (MoGs). Let x be a sample from the MoG with the density p(x) = (1/7) Σ_{i=0}^{6} N((sin(πi/4), cos(πi/4)), (0.01)^2 I_2), and z be a sample from the 256-dimensional Gaussian distribution N(0, I_256). (A sampling sketch appears after the table.) |
| Dataset Splits | No | The paper describes the datasets used (MoGs, MNIST/MNIST-M) but does not provide specific train/validation/test dataset splits by percentage, count, or a reference to a standard split definition. |
| Hardware Specification | Yes | We measure the runtime of the algorithms by wall clock on the same system using a single NVIDIA GTX980 4GB GPU with a single Intel Core i7-2600 CPU. |
| Software Dependencies | No | The paper mentions the optimizers used (Adam optimizer, momentum optimizer) but does not provide specific software library names with version numbers (e.g., PyTorch 1.9, TensorFlow 2.x). |
| Experiment Setup | Yes | Both G and D are two-layer tanh networks with 128 hidden units per layer, trained with the Adam optimizer with batch size 128 and learning rates of 10^-4 for the discriminator and 10^-3 for the generator. (A configuration sketch appears after the table.) |
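
Since the Pseudocode row points at Algorithm 1 (K-beam ϵ-subgradient descent), here is a minimal sketch of the underlying idea: keep K candidate maximizers, ascend each on f(u, v_k), and descend u along the gradient taken at the currently best candidate. The function names, the single-ascent-step-per-iteration schedule, the step sizes, and the toy surface are illustrative assumptions, not the paper's Algorithm 1 verbatim; see the repository above for the reference implementation.

```python
# Sketch of the K-beam minimax idea (assumptions noted in the text above).
import numpy as np

def k_beam_minimax(f, grad_u, grad_v, u0, v0_list, rho=1e-2, sigma=1e-2, steps=2000):
    """Approximately solve min_u max_v f(u, v) by tracking K candidate maximizers."""
    u = np.asarray(u0, dtype=float)
    vs = [np.asarray(v, dtype=float) for v in v0_list]
    for _ in range(steps):
        # Gradient-ascent step on every candidate maximizer v_k.
        vs = [v + sigma * grad_v(u, v) for v in vs]
        # Select the candidate that currently attains the largest f(u, v_k).
        k_best = int(np.argmax([f(u, v) for v in vs]))
        # (Sub)gradient-descent step on u, taken at the best candidate.
        u = u - rho * grad_u(u, vs[k_best])
    return u, vs

# Illustrative toy surface: f(u, v) = (u - 1)^2 - (v - u)^2 has minimax solution u = v = 1.
f = lambda u, v: (u - 1) ** 2 - (v - u) ** 2
grad_u = lambda u, v: 2 * (u - 1) + 2 * (v - u)
grad_v = lambda u, v: -2 * (v - u)
u_star, _ = k_beam_minimax(f, grad_u, grad_v, u0=0.0, v0_list=[-1.0, 0.5, 2.0])
```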
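The dataset quoted in the Open Datasets row is fully specified by its density, so it can be sampled directly. The sketch below draws from the 2-D mixture of 7 Gaussians with means (sin(πi/4), cos(πi/4)) and standard deviation 0.01, plus 256-dimensional Gaussian generator inputs; the function names and batch size shown are illustrative.

```python
# Sketch of the MoG data described in the Open Datasets row.
import numpy as np

rng = np.random.default_rng(0)
MEANS = np.array([[np.sin(np.pi * i / 4), np.cos(np.pi * i / 4)] for i in range(7)])

def sample_mog(n):
    """Draw n samples x ~ (1/7) * sum_{i=0}^{6} N((sin(pi*i/4), cos(pi*i/4)), (0.01)^2 I_2)."""
    idx = rng.integers(0, 7, size=n)               # pick a mixture component uniformly
    return MEANS[idx] + 0.01 * rng.standard_normal((n, 2))

def sample_noise(n):
    """Draw n generator inputs z ~ N(0, I_256)."""
    return rng.standard_normal((n, 256))

x_batch = sample_mog(128)    # real data batch
z_batch = sample_noise(128)  # latent batch
```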
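The Experiment Setup row gives the architecture and optimizer settings but, as noted in the Software Dependencies row, no framework version. The configuration sketch below uses PyTorch purely for illustration; the 128 hidden units, batch size 128, and learning rates 10^-4 (D) and 10^-3 (G) come from the quoted setup, while the 256→2 generator and 2→1 discriminator dimensions follow the MoG description above. "Two-layer tanh networks" is read here as two tanh hidden layers.

```python
# Sketch of the GAN setup quoted in the Experiment Setup row (framework choice is an assumption).
import torch
import torch.nn as nn

def two_layer_tanh(in_dim, out_dim, hidden=128):
    """Two tanh hidden layers with 128 units each, as described in the setup."""
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.Tanh(),
        nn.Linear(hidden, hidden), nn.Tanh(),
        nn.Linear(hidden, out_dim),
    )

G = two_layer_tanh(256, 2)  # generator: z in R^256 -> x in R^2
D = two_layer_tanh(2, 1)    # discriminator: x in R^2 -> logit

opt_D = torch.optim.Adam(D.parameters(), lr=1e-4)  # discriminator learning rate 10^-4
opt_G = torch.optim.Adam(G.parameters(), lr=1e-3)  # generator learning rate 10^-3
batch_size = 128
```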
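Finally, the Research Type row states that GAN performance is measured objectively by the Jensen-Shannon divergence. One way to approximate it from samples is to discretize both sample sets on a shared histogram grid; the binning scheme, smoothing constant, and function names below are illustrative assumptions, not the paper's exact evaluation protocol.

```python
# Sketch of a histogram-based Jensen-Shannon divergence estimate between two 2-D sample sets.
import numpy as np

def js_divergence(samples_p, samples_q, bins=50, eps=1e-10):
    """Approximate the JSD between two 2-D sample sets on a shared histogram grid."""
    lo = np.minimum(samples_p.min(axis=0), samples_q.min(axis=0))
    hi = np.maximum(samples_p.max(axis=0), samples_q.max(axis=0))
    edges = [np.linspace(lo[d], hi[d], bins + 1) for d in range(2)]

    p, _, _ = np.histogram2d(samples_p[:, 0], samples_p[:, 1], bins=edges)
    q, _, _ = np.histogram2d(samples_q[:, 0], samples_q[:, 1], bins=edges)
    p = (p.ravel() + eps) / (p.sum() + eps * p.size)   # smooth and normalize
    q = (q.ravel() + eps) / (q.sum() + eps * q.size)
    m = 0.5 * (p + q)

    kl = lambda a, b: np.sum(a * np.log(a / b))        # KL divergence of discrete distributions
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Example usage with the sampler above (generated_samples is a hypothetical (n, 2) array from G):
# jsd = js_divergence(sample_mog(10000), generated_samples)
```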