Generative Particle Variational Inference via Estimation of Functional Gradients

Authors: Neale Ratzlaff, Qinxun Bai, Li Fuxin, Wei Xu

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through carefully constructed experiments, we show that GPVI outperforms previous generative ParVI methods such as amortized SVGD, and is competitive with ParVI as well as gold-standard approaches like Hamiltonian Monte Carlo for fitting both exactly known and intractable target distributions.
Researcher Affiliation | Collaboration | (1) Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, Oregon; (2) Horizon Robotics, Cupertino, California.
Pseudocode | Yes | Algorithm 1 Generative Particle VI (GPVI)
Open Source Code | No | The paper does not contain an explicit statement about releasing the source code or a link to a code repository.
Open Datasets | Yes | We evaluated on the MNIST and CIFAR-10 image datasets, following Neal et al. (2018) to split each dataset into 6 inlier classes and 4 outlier classes.
Dataset Splits | Yes | We evaluated on the MNIST and CIFAR-10 image datasets, following Neal et al. (2018) to split each dataset into 6 inlier classes and 4 outlier classes. We further split the dataset by only using the first six classes for training and testing. The remaining four classes are only used to compute the AUC and ECE statistics.
Hardware Specification | No | No specific hardware details (e.g., GPU models, CPU models, or memory specifications) used for running the experiments were mentioned in the paper.
Software Dependencies | No | The paper mentions using the Adam optimizer but does not provide specific version numbers for any software dependencies such as programming languages or libraries.
Experiment Setup | Yes | The details of our experimental setup are as follows. In our BNN experiments, we parameterized samples from the target distribution as neural networks with a fixed architecture. [...] For GPVI and amortized ParVI methods we used a 3-layer MLP hypernetwork with layer widths [256, 512, 1024], ReLU activations, and input noise z ∈ R^256. We chose the LeNet-5 classifier architecture for all models, and trained for 100 epochs.
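The "Dataset Splits" row above describes a 6-inlier / 4-outlier split of MNIST and CIFAR-10. The following is a minimal sketch of how such a split could be reproduced; it assumes torchvision and takes labels 0-5 as the "first six classes", which is an assumption, since the paper follows Neal et al. (2018) and the exact class assignment may differ.

```python
# Hypothetical reconstruction of the 6-inlier / 4-outlier split (not the authors' code).
from torch.utils.data import Subset
from torchvision import datasets, transforms

INLIER_CLASSES = set(range(6))       # assumed: "first six classes" -> labels 0-5
OUTLIER_CLASSES = set(range(6, 10))  # remaining four classes, used only for AUC/ECE

def split_by_class(dataset, classes):
    """Return a Subset containing only examples whose label is in `classes`."""
    # `.targets` holds the label list/tensor for both MNIST and CIFAR-10 in torchvision.
    indices = [i for i, y in enumerate(dataset.targets) if int(y) in classes]
    return Subset(dataset, indices)

transform = transforms.ToTensor()
train_full = datasets.MNIST("data", train=True, download=True, transform=transform)
test_full = datasets.MNIST("data", train=False, download=True, transform=transform)

# Training and in-distribution testing use only the inlier classes.
train_inlier = split_by_class(train_full, INLIER_CLASSES)
test_inlier = split_by_class(test_full, INLIER_CLASSES)

# The outlier classes are never trained on; they are only scored at evaluation
# time to compute the AUC (outlier detection) and ECE (calibration) statistics.
test_outlier = split_by_class(test_full, OUTLIER_CLASSES)
```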
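The "Experiment Setup" row specifies the generator used by GPVI and the amortized ParVI baselines: a 3-layer MLP hypernetwork with widths [256, 512, 1024], ReLU activations, and input noise z ∈ R^256 that produces weights for a LeNet-5 classifier. Below is a minimal sketch of that architecture under the assumption that "3-layer" means three hidden layers followed by a linear projection to the classifier's parameter vector; the class name, the 61,706 LeNet-5 parameter count, and the standard-normal noise are all assumptions, not details confirmed by the paper.

```python
# Hypothetical hypernetwork sketch matching the stated widths (not the authors' code).
import torch
import torch.nn as nn

class HyperNetwork(nn.Module):
    def __init__(self, noise_dim=256, hidden=(256, 512, 1024), target_param_count=61_706):
        # 61,706 is the parameter count of a standard LeNet-5; the exact number
        # depends on the classifier configuration actually used in the paper.
        super().__init__()
        dims = (noise_dim,) + tuple(hidden)
        layers = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            layers += [nn.Linear(d_in, d_out), nn.ReLU()]
        # Final projection maps the last hidden layer to a flat classifier weight vector.
        layers.append(nn.Linear(dims[-1], target_param_count))
        self.net = nn.Sequential(*layers)

    def forward(self, z):
        # z: (batch, noise_dim) noise samples.
        # Each output row is one "particle", i.e. a full weight vector for the classifier.
        return self.net(z)

# Usage: draw a batch of particles (classifier weight vectors) from noise.
hypernet = HyperNetwork()
z = torch.randn(8, 256)      # 8 particles; z ~ N(0, I) in R^256 is an assumed noise distribution
particles = hypernet(z)      # shape: (8, 61706)
```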