Generalized Variational Inference via Optimal Transport

Authors: Jinjin Chi, Zhichao Zhang, Zhiyao Yang, Jihong Ouyang, Hongbin Pei

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We provide the consistency analysis of approximate posteriors and demonstrate the practical effectiveness on Bayesian neural networks and variational autoencoders."
Researcher Affiliation | Academia | "Jinjin Chi¹,², Zhichao Zhang¹,², Zhiyao Yang¹,², Jihong Ouyang¹,², Hongbin Pei³*. ¹College of Computer Science and Technology, Jilin University, China; ²Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, China; ³MOE KLINNS Lab, School of Cyber Science and Engineering, Xi'an Jiaotong University, China"
Pseudocode | Yes | "Algorithm 1: Computation of OT distance" and "Algorithm 2: Optimization of VOT". (An illustrative OT-distance sketch follows this table.)
Open Source Code | No | The paper states that well-known variational inference methods with different α-divergences are implemented using publicly available code (https://github.com/YingzhenLi/VRbound), and that WVI (Ambrogioni et al. 2018) is implemented upon publicly available code (https://github.com/zqkhan/wvi_pytorch). However, it provides no statement or link for the source code of the proposed method (VOT).
Open Datasets | Yes | "The linear regression task is performed on seven widely-used benchmark data sets from the UCI dataset repository (http://archive.ics.uci.edu/ml/datasets.html). The statistics of the data sets are shown in Table 1." and "Variational Autoencoder ... MNIST dataset (http://yann.lecun.com/exdb/mnist/), a collection of handwritten digits from zero to nine."
Dataset Splits | Yes | "Each data set is randomly split into 90% for training and 10% for testing." (UCI benchmarks) and "The sizes of training and testing are 60000 and 10000, respectively." (MNIST). (A split sketch follows this table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory, or cloud instance types).
Software Dependencies | No | The paper mentions the 'Adam optimizer' and 'automatic differentiation tools (Team 2015)' but does not provide specific version numbers for the software libraries or dependencies used in its implementation.
Experiment Setup | Yes | "In all methods, Adam optimizer is employed to adjust the learning rate with parameters β1 = 0.9, β2 = 0.999 and α = 0.001 (Kingma and Ba 2015). The sample number S is set to 128 and the training epoch is set to 500. The entropy regularization parameter ε is set to 0.1. The constants a and b in λ are set to 2×10³ and 2×10⁴, respectively." (An optimizer/hyperparameter sketch follows this table.)
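
The paper's Algorithm 1 (computation of the OT distance) is not reproduced here. As a rough illustration only, the sketch below computes an entropy-regularized OT distance between two sample sets with the standard Sinkhorn fixed-point scheme, using uniform weights, a squared Euclidean cost, and the ε = 0.1 regularization reported in the experiment setup; the function name, cost choice, and iteration count are assumptions, not the authors' implementation.

    import torch

    def sinkhorn_ot(x, y, eps=0.1, n_iters=200):
        # Generic Sinkhorn sketch (not the authors' Algorithm 1).
        # x: (n, d) samples from one distribution, y: (m, d) samples from the other.
        n, m = x.shape[0], y.shape[0]
        cost = torch.cdist(x, y, p=2) ** 2        # squared Euclidean cost matrix
        K = torch.exp(-cost / eps)                # Gibbs kernel for regularization eps
        a = torch.full((n,), 1.0 / n)             # uniform weights on x samples
        b = torch.full((m,), 1.0 / m)             # uniform weights on y samples
        u, v = torch.ones(n), torch.ones(m)
        for _ in range(n_iters):                  # Sinkhorn fixed-point iterations
            u = a / (K @ v)
            v = b / (K.t() @ u)
        plan = u[:, None] * K * v[None, :]        # approximate transport plan
        return torch.sum(plan * cost)             # entropy-regularized OT cost

    # With the reported sample number S = 128, x and y would be (128, d) tensors.
    x = torch.randn(128, 2)
    y = torch.randn(128, 2) + 1.0
    print(sinkhorn_ot(x, y).item())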
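
For the reported 90%/10% UCI splits, a minimal sketch of the kind of random split described; scikit-learn's train_test_split and the placeholder data are assumptions, since the paper does not name a loading library. MNIST already ships with the fixed 60000/10000 train/test split quoted above, so no custom split is needed there.

    import numpy as np
    from sklearn.model_selection import train_test_split

    # Placeholder features/targets standing in for one UCI regression data set.
    X = np.random.randn(1000, 8)
    y = np.random.randn(1000)

    # Random 90% training / 10% testing split, as described for the UCI benchmarks.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1)
    print(X_train.shape, X_test.shape)   # (900, 8) (100, 8)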
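
The quoted experiment setup maps onto a standard Adam configuration. A hedged sketch under the assumption that PyTorch is the framework (the paper only cites automatic differentiation tools and does not name one); the placeholder model is illustrative, not the paper's BNN or VAE architecture.

    import torch

    model = torch.nn.Linear(10, 1)   # placeholder standing in for the BNN / VAE

    # Adam with the reported hyperparameters: alpha = 0.001, beta1 = 0.9, beta2 = 0.999.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

    S = 128        # number of samples per update, as reported
    EPOCHS = 500   # training epochs, as reported
    EPS = 0.1      # entropy regularization parameter for the OT distance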