Exponential Family Estimation via Adversarial Dynamics Embedding

Authors: Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | An empirical investigation shows that adapting the sampler during MLE can significantly improve on state-of-the-art estimators. In this section, we test ADE on several synthetic datasets in Section 5.1 and real-world image datasets in Section 5.2.
Researcher Affiliation | Collaboration | (1) Google Research, Brain Team; (2) Mila, University of Montreal; (3) University of Illinois at Urbana-Champaign; (4) University College London; (5) Georgia Institute of Technology; (6) Ant Financial; (7) University of Alberta
Pseudocode | Yes | Algorithm 1 MLE via Adversarial Dynamics Embedding (ADE)
Open Source Code | Yes | The code repository is available at https://github.com/lzzcd001/ade-code.
Open Datasets | Yes | We apply ADE to MNIST and CIFAR-10 data.
Dataset Splits | No | The paper refers to an appendix for experiment details (Appendix F.1 and F.2), but the provided text does not contain explicit dataset split information (percentages, counts, or specific predefined splits).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | We keep the model sizes the same in NF and ADE (10 planar layers). Then we perform 5-steps stochastic Langevin steps to obtain the final samples x_T with standard Gaussian noise in each step, and without incurring extra memory cost. For fairness, we conduct CD with 15 steps. This setup is preferable to CD with an extra acceptance-rejection step. ... In both cases, we use a CNN architecture for the discriminator, following Miyato et al. (2018), with spectral normalization added to the discriminator layers. In particular, for the discriminator in the CIFAR-10 experiments, we replace all downsampling operations by average pooling, as in Du and Mordatch (2019). ... The output sample is clipped to [0, 1] after each HMC step and the Deep LVM initialization.
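
The experiment setup mentions a short chain of stochastic Langevin steps with standard Gaussian noise, and clipping samples to [0, 1] in the image experiments. The following is a minimal NumPy sketch of that kind of update, not the authors' implementation. The toy Gaussian log-density, the function names, the step size, and the choice to clip after each Langevin step (the source describes clipping after each HMC step) are all assumptions made for illustration.

```python
import numpy as np

def grad_log_density(x, mean=0.5, var=0.05):
    # Hypothetical toy model: isotropic Gaussian with
    # log p(x) = -||x - mean||^2 / (2 * var) + const.
    return -(x - mean) / var

def langevin_sample(x0, grad_fn, n_steps=5, step_size=1e-3, clip=None, rng=None):
    """Run n_steps of unadjusted Langevin dynamics starting from x0.

    Update: x <- x + (step_size / 2) * grad log p(x) + sqrt(step_size) * xi,
    with xi drawn from a standard Gaussian at every step.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=np.float64)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + 0.5 * step_size * grad_fn(x) + np.sqrt(step_size) * noise
        if clip is not None:
            # Assumption: keep samples in a valid pixel range after each step.
            x = np.clip(x, clip[0], clip[1])
    return x

# Usage: 4 two-dimensional initial samples, 5 Langevin steps, clipped to [0, 1].
x0 = np.random.default_rng(0).uniform(0.0, 1.0, size=(4, 2))
x_T = langevin_sample(x0, grad_log_density, n_steps=5, clip=(0.0, 1.0),
                      rng=np.random.default_rng(1))
```

In this sketch there is no acceptance-rejection step, so the chain is only approximately targeting the toy density; the step size of 1e-3 is an arbitrary illustrative value, not one reported in the source.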