Boundary-Seeking GANs
Authors: R Devon Hjelm, Athul Paul Jacob, Adam Trischler, Tong Che, Kyunghyun Cho, Yoshua Bengio
ICLR 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of the proposed algorithm with discrete image and character-based natural language generation. In addition, the boundary-seeking objective extends to continuous data, which can be used to improve stability of training, and we demonstrate this on CelebA, Large-scale Scene Understanding (LSUN) bedrooms, and ImageNet without conditioning. |
| Researcher Affiliation | Collaboration | R Devon Hjelm (MILA, University of Montréal, IVADO) erroneus@gmail.com; Athul Paul Jacob (MILA, MSR, University of Waterloo) apjacob@edu.uwaterloo.ca; Tong Che (MILA, University of Montréal) tong.che@umontreal.ca; Adam Trischler (MSR) adam.trischler@microsoft.com; Kyunghyun Cho (New York University, CIFAR Azrieli Global Scholar) kyunghyun.cho@nyu.edu; Yoshua Bengio (MILA, University of Montréal, CIFAR, IVADO) yoshua.bengio@umontreal.ca |
| Pseudocode | Yes | Algorithm 1: Discrete Boundary Seeking GANs (a hedged sketch of this update follows the table) |
| Open Source Code | No | The paper does not provide an explicit statement or link to its own open-source code for the methodology described. |
| Open Datasets | Yes | We first verify the gradient estimator provided by BGAN works quantitatively in the discrete setting by evaluating its ability to train a classifier with the CIFAR-10 dataset (Krizhevsky & Hinton, 2009). ... We tested BGAN using two imaging benchmarks: the common discretized MNIST dataset (Salakhutdinov & Murray, 2008) and a new quantized version of the CelebA dataset (see Liu et al., 2015, for the original CelebA dataset). ... Next, we test BGAN in a natural language setting with the 1-billion word dataset (Chelba et al., 2013), modeling at the character-level and limiting the dataset to sentences of at least 32 and truncating to 32 characters. (A preprocessing sketch follows the table.) |
| Dataset Splits | Yes | We trained the importance sampling BGAN on the set of f-divergences given in Table 1 as well as the REINFORCE counterpart for 200 epochs and report the accuracy on the test set. ... Final evaluation was done by estimating difference measures using 60000 MNIST training examples against 60000 samples from each generator, averaged over 12 batches of 5000. We used the training set as this is the distribution over which the discriminators were trained. Test set estimates in general were close and did not diverge from training set distances, indicating the discriminators were not overfitting, but training set estimates were slightly higher on average. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software like "Theano", "Lasagne", and "Fuel" but does not specify their version numbers, which are necessary for reproducibility. |
| Experiment Setup | Yes | We update the generator for 5 steps for every discriminator step. ... Each model was trained for 300 generator epochs, with the discriminator being updated 5 times per generator update for WGAN-GP and 1 time per generator update for the BGAN models (in other words, the generators were trained for the same number of updates). ... WGAN-GP was trained with a gradient penalty hyper-parameter of 5.0... The BGAN models were trained with the gradient norm penalty of 5.0 (Roth et al., 2017). (A hedged training-loop sketch follows the table.) |
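
The Pseudocode row above cites Algorithm 1 (Discrete Boundary Seeking GANs). Below is a minimal sketch, not the authors' implementation, of the importance-weighted generator update that algorithm describes, assuming a categorical (softmax) generator over token sequences and a discriminator that returns one unnormalized score per sampled sequence; all function and argument names are illustrative.

```python
# Hedged sketch of a discrete boundary-seeking generator update
# (importance-weighted REINFORCE-style estimator). Assumptions: the
# generator maps noise z to per-position logits over a vocabulary, and
# the discriminator scores integer token sequences.
import torch
import torch.nn.functional as F

def bgan_generator_step(generator, discriminator, g_optim, z, num_samples=20):
    """One boundary-seeking update for a categorical generator."""
    logits = generator(z)                            # (batch, length, vocab)
    probs = F.softmax(logits, dim=-1)
    dist = torch.distributions.Categorical(probs=probs)
    samples = dist.sample((num_samples,))            # (M, batch, length) tokens
    with torch.no_grad():
        flat = samples.reshape(-1, samples.size(-1))
        scores = discriminator(flat).view(num_samples, -1)   # (M, batch)
        # Self-normalized importance weights w_m ∝ exp(D(x_m)).
        weights = torch.softmax(scores, dim=0)
    # Weighted log-likelihood surrogate: no gradient flows through samples,
    # only through the generator's log-probabilities of those samples.
    log_probs = dist.log_prob(samples).sum(dim=-1)   # (M, batch)
    loss = -(weights * log_probs).sum(dim=0).mean()
    g_optim.zero_grad()
    loss.backward()
    g_optim.step()
    return loss.item()
```

The self-normalized weights concentrate the log-likelihood gradient on the samples the discriminator scores most highly, which is what lets the signal pass through discrete samples without backpropagating through them.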
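The Open Datasets row mentions a discretized MNIST and a quantized CelebA. A hedged preprocessing sketch is below; the stochastic binarization follows the common Salakhutdinov & Murray (2008) recipe, while the choice of 16 quantization levels is an assumption rather than a value reported in the paper.

```python
# Hedged sketch of the dataset preprocessing referenced above. The number
# of quantization levels (16) is an assumption, not taken from the paper.
import numpy as np

def binarize_mnist(images, rng=None):
    """Sample each pixel as Bernoulli(p = intensity in [0, 1])."""
    rng = rng or np.random.default_rng(0)
    return (rng.random(images.shape) < images).astype(np.int64)

def quantize(images, levels=16):
    """Map intensities in [0, 1] to integer bins {0, ..., levels - 1}."""
    return np.clip((images * levels).astype(np.int64), 0, levels - 1)

# Usage with random stand-in arrays of plausible shapes:
mnist = np.random.rand(64, 28, 28)        # placeholder for real MNIST
celeba = np.random.rand(64, 3, 32, 32)    # placeholder for real CelebA crops
print(binarize_mnist(mnist).max(), quantize(celeba).max())
```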
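Finally, the Experiment Setup row describes the continuous-data comparison: one discriminator update per generator update for the BGAN models and a gradient norm penalty of 5.0 (Roth et al., 2017). The sketch below, again under stated assumptions (a logit-output discriminator, a non-saturating discriminator loss, and the squared boundary-seeking generator loss for continuous data), shows how such a step could be wired up; it is not the authors' training code.

```python
# Hedged sketch of one BGAN-style training step on continuous image data:
# one discriminator update per generator update, with a squared gradient
# norm penalty (Roth et al., 2017-style) weighted by 5.0.
import torch
import torch.nn.functional as F

PENALTY = 5.0

def grad_norm_penalty(discriminator, x):
    """Squared norm of grad_x D(x), averaged over the batch."""
    x = x.clone().requires_grad_(True)
    d = discriminator(x)
    grads, = torch.autograd.grad(d.sum(), x, create_graph=True)
    return grads.flatten(1).pow(2).sum(dim=1).mean()

def train_step(generator, discriminator, g_opt, d_opt, real, z):
    # Discriminator update (1 per generator step for the BGAN models).
    fake = generator(z).detach()
    d_loss = (F.softplus(-discriminator(real)).mean()
              + F.softplus(discriminator(fake)).mean()
              + PENALTY * grad_norm_penalty(discriminator, real))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()
    # Generator update: continuous boundary-seeking loss, which pushes the
    # discriminator's logit on generated samples toward the decision
    # boundary (0); equals 0.5 * (log D - log(1 - D))^2 for a sigmoid D.
    g_loss = 0.5 * discriminator(generator(z)).pow(2).mean()
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```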