Multimodal Adversarially Learned Inference with Factorized Discriminators

Authors: Wenxue Chen, Jianke Zhu (pp. 6304-6312)

AAAI 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We have conducted experiments on the benchmark datasets, whose promising results show that our proposed approach outperforms the state-of-the-art methods on a variety of metrics."
Researcher Affiliation | Collaboration | 1 Zhejiang University; 2 Alibaba-Zhejiang University Joint Institute of Frontier Technologies; {wxchern, jkzhu}@zju.edu.cn
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "The source code is publicly available at https://github.com/6b5d/mmali."
Open Datasets | Yes | "We have conducted experiments on the benchmark datasets... Multi MNIST dataset... MNIST-SVHN dataset... Caltech-UCSD Birds (CUB) dataset (Welinder et al. 2010)"
Dataset Splits | No | The paper references benchmark datasets (Multi MNIST, MNIST-SVHN, CUB) but does not explicitly provide percentages or sample counts for training, validation, and test splits, nor does it specify whether standard splits are used.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory, or cloud computing specifications used for running the experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details, such as library names with version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | "Across all the experiments, we use Adam optimizer (Kingma and Ba 2015) with the learning rate 0.0002, in which all models are trained for 250,000 iterations with the batch size 64. We also employ an exponential moving average (Yazici et al. 2019) of the weights for both encoders and decoders with a decay rate 0.9999 and start at the 50,000-th iteration. In all models, the standard Gaussian N(0, I) is chosen as the prior, and the isotropic Gaussian N(µ, σ²I) is chosen as the posterior. We make use of the non-saturating loss described in Dandi et al. (2021) to update the encoders and decoders."
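
The quoted setup maps onto a short training-loop configuration. Below is a minimal sketch, assuming a PyTorch implementation: the tiny encoder/decoder, the latent size, the dummy data batch, and the stand-in discriminator score are hypothetical placeholders, while the hyperparameters (Adam with learning rate 0.0002, 250,000 iterations, batch size 64, weight EMA with decay 0.9999 starting at iteration 50,000, a standard Gaussian prior, an isotropic Gaussian posterior, and a non-saturating generator-side loss) are taken from the row above.

```python
import copy
import torch
import torch.nn as nn

LATENT_DIM = 20        # hypothetical; the latent size is not given in this excerpt
LR = 2e-4              # "learning rate 0.0002"
ITERATIONS = 250_000   # "trained for 250,000 iterations"
BATCH_SIZE = 64
EMA_DECAY = 0.9999     # EMA of encoder/decoder weights
EMA_START = 50_000     # EMA starts at the 50,000-th iteration

# Placeholder networks; the real encoders/decoders are modality-specific.
encoder = nn.Sequential(nn.Flatten(), nn.Linear(784, 2 * LATENT_DIM))  # outputs (mu, log-variance)
decoder = nn.Sequential(nn.Linear(LATENT_DIM, 784), nn.Sigmoid())

optimizer = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=LR
)

# EMA copies of the generator-side weights, updated once iteration EMA_START is reached.
ema_encoder, ema_decoder = copy.deepcopy(encoder), copy.deepcopy(decoder)

@torch.no_grad()
def ema_update(ema_model, model, decay=EMA_DECAY):
    for ema_p, p in zip(ema_model.parameters(), model.parameters()):
        ema_p.mul_(decay).add_(p, alpha=1.0 - decay)

def sample_posterior(stats):
    """Reparameterized sample from the isotropic Gaussian posterior N(mu, sigma^2 I)."""
    mu, logvar = stats.chunk(2, dim=-1)
    return mu + torch.randn_like(mu) * (0.5 * logvar).exp()

for it in range(1, ITERATIONS + 1):
    x = torch.rand(BATCH_SIZE, 1, 28, 28)          # placeholder batch; real data comes from the benchmark loaders
    z_prior = torch.randn(BATCH_SIZE, LATENT_DIM)  # standard Gaussian prior N(0, I)
    z_post = sample_posterior(encoder(x))

    # Stand-in for a discriminator score on generated (x, z) pairs; the actual
    # factorized discriminators are omitted from this sketch.
    fake_score = torch.sigmoid(decoder(z_prior).mean() + z_post.mean())
    loss = -torch.log(fake_score + 1e-8)           # non-saturating generator-side loss

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

    if it >= EMA_START:
        ema_update(ema_encoder, encoder)
        ema_update(ema_decoder, decoder)

    break  # keep the sketch cheap to run; remove to execute the full schedule
```

For the authors' actual architectures, discriminator factorization, and loss implementation, refer to the released code at https://github.com/6b5d/mmali.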