Disentangled Recurrent Wasserstein Autoencoder
Authors: Jun Han, Martin Renqiang Min, Ligong Han, Li Erran Li, Xuan Zhang
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on a variety of datasets show that our models outperform other baselines with the same settings in terms of disentanglement and unconditional video generation both quantitatively and qualitatively. |
| Researcher Affiliation | Collaboration | Jun Han (PCG, Tencent, junhanjh@tencent.com); Martin Renqiang Min (NEC Laboratories America, renqiang@nec-labs.com); Ligong Han (Rutgers University, hanligong@gmail.com); Li Erran Li (Alexa AI, Amazon, erranlli@gmail.com); Xuan Zhang (Texas A&M University, floatlazer@gmail.com) |
| Pseudocode | Yes | Algorithm 1 R-WAE(GAN) and Algorithm 2 R-WAE(MMD) are explicitly provided in Appendix D (an illustrative MMD-penalty sketch follows this table). |
| Open Source Code | No | The paper does not include an explicit statement or link confirming the release of its source code for the described methodology. |
| Open Datasets | Yes | We train our models on the Stochastic Moving MNIST (SM-MNIST), Sprites, and TIMIT datasets under a completely unsupervised setting. The number of actions (motions) is used as prior information for all methods on the MUG facial dataset. |
| Dataset Splits | Yes | We use 6 variants in each of 4 attribute categories (skin colors, tops, pants and hair style) and there are 6⁴ = 1296 unique characters in total, where 1000 of them are used for training and the rest are used for testing. (Sprites dataset) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., specific GPU or CPU models). |
| Software Dependencies | No | The paper mentions using 'Adam optimizer (Kingma & Ba, 2015)' but does not specify version numbers for any software dependencies, libraries, or programming languages. |
| Experiment Setup | Yes | The penalty coefficients β1 and β2 are, respectively, 5 and 20. The learning rate for the decoder is 5 × 10⁻⁴, the learning rate for the encoder is 1 × 10⁻⁴, and the learning rate for fγ is 1 × 10⁻⁴. The batch size on both the SM-MNIST and Sprites datasets is 60, and the length of the video sequence for training is T = 8. (Appendix G; a hedged optimizer sketch follows this table.) |
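The Pseudocode row cites Algorithm 2, R-WAE(MMD). Since no source code is released, the following is a minimal sketch of the MMD penalty such an algorithm minimizes between encoded latents and prior samples. The IMQ kernel choice, the `mmd_penalty` name, and the bandwidth heuristic are assumptions in the style of standard WAE implementations, not the paper's exact formulation.

```python
import torch

def mmd_penalty(z_q: torch.Tensor, z_p: torch.Tensor, scale: float = 1.0) -> torch.Tensor:
    """Unbiased MMD^2 estimate between posterior samples z_q and prior
    samples z_p (both of shape (batch, dim)), using an inverse
    multiquadratic (IMQ) kernel as is common in the WAE family."""
    n, d = z_q.shape
    c = 2.0 * d * scale  # IMQ bandwidth heuristic for a unit-variance prior

    def imq(a, b):
        # k(x, y) = c / (c + ||x - y||^2), built from pairwise squared distances
        return c / (c + torch.cdist(a, b) ** 2)

    k_qq, k_pp, k_qp = imq(z_q, z_q), imq(z_p, z_p), imq(z_q, z_p)
    # drop diagonal terms for the unbiased within-sample sums
    off_diag = 1.0 - torch.eye(n, device=z_q.device)
    return ((k_qq * off_diag).sum() + (k_pp * off_diag).sum()) / (n * (n - 1)) - 2.0 * k_qp.mean()
```

In a WAE-style objective, this term would be weighted by a penalty coefficient (β1 or β2 in the paper's notation) and added to the reconstruction loss.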
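The Experiment Setup row quotes the Appendix G hyperparameters; the sketch below translates them into an optimizer configuration. Only the learning rates, batch size, sequence length, and penalty coefficients come from the quoted text; the linear networks are hypothetical stand-ins for the paper's recurrent encoder, decoder, and critic fγ.

```python
import torch
import torch.nn as nn

# Values quoted from Appendix G. Note that β1 and β2 are the paper's
# penalty coefficients, not Adam's momentum parameters.
BETA_1, BETA_2 = 5.0, 20.0
BATCH_SIZE, SEQ_LEN = 60, 8  # SM-MNIST and Sprites settings, T = 8

# Placeholder networks: the actual models are recurrent video models;
# linear layers keep this sketch self-contained and runnable.
encoder = nn.Linear(64, 16)
decoder = nn.Linear(16, 64)
f_gamma = nn.Linear(16, 1)  # critic f_γ

# Adam optimizer (Kingma & Ba, 2015) with the per-module learning rates.
opt_decoder = torch.optim.Adam(decoder.parameters(), lr=5e-4)
opt_encoder = torch.optim.Adam(encoder.parameters(), lr=1e-4)
opt_critic = torch.optim.Adam(f_gamma.parameters(), lr=1e-4)
```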