Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MixFlows: principled variational inference via mixed flows
Authors: Zuheng Xu, Naitong Chen, Trevor Campbell
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulated and real data experiments show that MixFlows can provide more reliable posterior approximations than several black-box normalizing flows, as well as samples of comparable quality to those obtained from state-of-the-art MCMC methods. |
| Researcher Affiliation | Academia | 1University of British Columbia, Department of Statistics, Vancouver, Canada. |
| Pseudocode | Yes | Algorithm 1 Sample(qλ,N): take a draw from qλ,N; Algorithm 2 log qλ,N(x): evaluate the log-density of qλ,N; Algorithm 3 EstELBO(λ, N): obtain an unbiased estimate of the ELBO for qλ,N. |
| Open Source Code | Yes | Code is available at https://github.com/zuhengxu/Ergodic-variational-flow-code. |
| Open Datasets | Yes | For the linear regression problem with a normal prior, we use the Boston housing prices dataset (Harrison Jr. & Rubinfeld, 1978). For linear regression with a heavy-tailed prior, we use the communities and crime dataset (U.S. Department of Justice Federal Bureau of Investigation, 1995). For logistic regression, we use a bank marketing dataset (Moro et al., 2014). For the Poisson regression problem, we use an airport delays dataset... We also consider a Bayesian Student-t regression problem... In this example, we use the creatinine dataset (Liu & Rubin, 1995). Finally, we compare the methods on the Bayesian sparse regression problem applied to two datasets: a prostate cancer dataset... and a superconductivity dataset (Hamidieh, 2018). |
| Dataset Splits | No | The paper does not explicitly describe train/validation/test dataset splits. It mentions using samples for KSD estimation and training iterations for NUTS and NFs, but not a partitioning of the datasets into these specific subsets for model development and evaluation. |
| Hardware Specification | Yes | All experiments were conducted on a machine with an AMD Ryzen 9 3900X and 32GB of RAM. |
| Software Dependencies | No | The paper mentions software like the 'Julia package AdvancedHMC.jl' and the 'Julia package Bijectors.jl'. However, it does not specify exact version numbers for these software packages or for the Julia language itself, which are necessary for full reproducibility. |
| Experiment Setup | Yes | For all experiments, unless otherwise stated, NUTS uses 20,000 steps for adaptation, targeting an average acceptance ratio of 0.7, and generates 5,000 samples for KSD estimation. The KSD for MixFlows is estimated using 2,000 samples. For all three examples, we used the leapfrog stepsize ϵ = 0.05 and run L = 50 leapfrogs between each refreshment. We train NF using ADAM until convergence (100,000 iterations except where otherwise noted) with the initial step size set to 0.001. |
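The three algorithms quoted in the Pseudocode row share a common structure: qλ,N is a mixture of N pushforwards of a reference density under repeated application of an invertible map. The sketch below illustrates that structure only; it is not the authors' implementation. The affine map `T`, the standard-normal reference `log_q0`, and all constants are illustrative assumptions (MixFlows instead use ergodic, measure-preserving Hamiltonian-style maps, and the paper's experiments are in Julia).

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative invertible map T (affine here, chosen only so that the
# inverse and log|det Jacobian| are trivial to write down).
a, b = 0.9, 0.5
def T(x):      return a * x + b
def T_inv(x):  return (x - b) / a

def log_q0(x):
    """Reference density q0: standard normal."""
    return -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)

def sample(N):
    """Algorithm 1 analogue: draw X0 ~ q0, k ~ Uniform{0,...,N-1}, return T^k(X0)."""
    x = rng.standard_normal()
    k = rng.integers(N)
    for _ in range(k):
        x = T(x)
    return x

def log_qN(x, N):
    """Algorithm 2 analogue: log of (1/N) * sum_k q0(T^{-k} x) |det J_{T^{-k}}(x)|."""
    terms = []
    y, logdet = x, 0.0
    for _ in range(N):
        terms.append(log_q0(y) + logdet)
        y = T_inv(y)
        logdet += -np.log(abs(a))   # each T^{-1} contributes log|1/a|
    m = max(terms)                  # log-sum-exp for numerical stability
    return m + np.log(np.mean(np.exp(np.array(terms) - m)))

def est_elbo(log_p, N, n_draws=100):
    """Algorithm 3 analogue: Monte Carlo estimate of E[log p(X) - log qN(X)]."""
    draws = [sample(N) for _ in range(n_draws)]
    return float(np.mean([log_p(x) - log_qN(x, N) for x in draws]))
```

With N = 1 the family reduces to plain q0; increasing N averages more pushforwards, which is the mechanism behind the reliability guarantees the abstract refers to.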