Exponential Family Estimation via Adversarial Dynamics Embedding
Authors: Bo Dai, Zhen Liu, Hanjun Dai, Niao He, Arthur Gretton, Le Song, Dale Schuurmans
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | An empirical investigation shows that adapting the sampler during MLE can significantly improve on state-of-the-art estimators. In this section, we test ADE on several synthetic datasets in Section 5.1 and real-world image datasets in Section 5.2. |
| Researcher Affiliation | Collaboration | (1) Google Research, Brain Team; (2) Mila, University of Montreal; (3) University of Illinois at Urbana-Champaign; (4) University College London; (5) Georgia Institute of Technology; (6) Ant Financial; (7) University of Alberta |
| Pseudocode | Yes | Algorithm 1 MLE via Adversarial Dynamics Embedding (ADE) |
| Open Source Code | Yes | The code repository is available at https://github.com/lzzcd001/ade-code. |
| Open Datasets | Yes | We apply ADE to MNIST and CIFAR-10 data. |
| Dataset Splits | No | The paper refers to an appendix for experiment details (Appendix F.1 and F.2), but the provided text does not contain explicit dataset split information (percentages, counts, or specific predefined splits). |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | We keep the model sizes the same in NF and ADE (10 planar layers). Then we perform 5 stochastic Langevin steps to obtain the final samples $x_T$, with standard Gaussian noise in each step and without incurring extra memory cost. For fairness, we conduct CD with 15 steps. This setup is preferable to CD with an extra acceptance-rejection step. ... In both cases, we use a CNN architecture for the discriminator, following Miyato et al. (2018), with spectral normalization added to the discriminator layers. In particular, for the discriminator in the CIFAR-10 experiments, we replace all downsampling operations by average pooling, as in Du and Mordatch (2019). ... The output sample is clipped to [0, 1] after each HMC step and the Deep LVM initialization. |
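
The Experiment Setup row mentions a short chain of Langevin steps with standard Gaussian noise, and clipping samples to [0, 1] in the image experiments. The following is a minimal NumPy sketch of that kind of update, not the authors' implementation. The function names, the step size of 0.01, the toy quadratic energy, and the convention that the density is proportional to exp(-energy) are all assumptions made for illustration, and the sketch applies the clipping to plain Langevin steps rather than to the HMC steps the paper describes.

```python
import numpy as np


def langevin_sample(grad_energy, x0, n_steps=5, step_size=0.01, clip=None, rng=None):
    """Run a few unadjusted Langevin steps from x0 (hypothetical helper).

    Assumes the target density is proportional to exp(-energy(x)), so each
    step moves against the energy gradient and adds Gaussian noise.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)  # standard Gaussian noise per step
        x = x - 0.5 * step_size * grad_energy(x) + np.sqrt(step_size) * noise
        if clip is not None:
            # Keep samples in a valid range, e.g. [0, 1] for image pixels.
            x = np.clip(x, clip[0], clip[1])
    return x


def quadratic_energy_grad(x):
    """Gradient of the toy energy 0.5 * ||x||^2 (an illustrative stand-in)."""
    return x


rng = np.random.default_rng(0)
x_init = rng.uniform(0.0, 1.0, size=(16, 2))  # stand-in for the initial samples
x_final = langevin_sample(
    quadratic_energy_grad, x_init, n_steps=5, step_size=0.01, clip=(0.0, 1.0), rng=rng
)
```

In a learned model, `grad_energy` would come from automatic differentiation of the energy network rather than a closed-form expression, and the step size would need tuning per dataset.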