Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

Asymptotically exact variational flows via involutive MCMC kernels

Authors: Zuheng (David) Xu, Trevor Campbell

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	5 Experiments. This section presents an empirical evaluation of the four proposed flows: three IRF variants and homogeneous MixFlows (collectively referred to as IRF flows, since homogeneous MixFlows can be viewed as a special case). We compare them against two normalizing flows, Real NVP [19] and Neural Spline Flow (NSF) [71], and against the No-U-Turn Sampler (NUTS) [72].
Researcher Affiliation	Academia	Zuheng Xu, Trevor Campbell, Department of Statistics, University of British Columbia, [zuheng.xu | trevor]@stat.ubc.ca
Pseudocode	Yes	The detailed transition procedure of involutive MCMC is described in Algorithm 1 of Appendix A.2. Consider an auxiliary variable v defined on a space V, with conditional density ρ(v | x) given x ∈ X with respect to a base measure m_v on V, and the augmented target density π(x, v) := π(x)ρ(v | x). Let m̄ := m × m_v be the joint base measure on X × V. For an involution g: X × V → X × V, each transition from state x proceeds in three steps:
Open Source Code	Yes	Code for reproducing the main experimental results is available at: https://github.com/zuhengxu/MixFlow.jl.git.
Open Datasets	Yes	Our synthetic experiments consist of four 2-dimensional targets used by Xu et al. [39]: the Banana [76], Neal's funnel [77], a cross-shaped Gaussian mixture, and a warped Gaussian distribution. ... and a latent Brownian motion model (Brownian; 32-dimensional) and the Log-Gaussian Cox process model (LGCP; 1600-dimensional) from the Inference Gym library [79].
Dataset Splits	No	The paper describes training procedures and evaluation against ground truth or benchmarks, but does not explicitly provide training/test/validation split percentages or sample counts for the datasets used.
Hardware Specification	Yes	Experiments are conducted on the following platforms: a local machine equipped with an AMD Ryzen 9 5900X CPU and 64 GB of RAM, the ARC Sockeye computing platform at the University of British Columbia, and the high-performance compute cluster provided by the Digital Research Alliance of Canada.
Software Dependencies	No	For NUTS benchmarks, we use the Julia package AdvancedHMC.jl [84] with default settings throughout. The paper mentions the use of the Adam optimizer for training but does not specify its version or the versions of other software libraries used for its own implementation.
Experiment Setup	Yes	All IRF flows start from the same reference distribution q_0: a mean-field Gaussian trained for 10K Adam steps with batch size 10 and learning rate 10^-3. All IRF flows are evaluated with 64 i.i.d. draws, while normalizing flows use 1024. ... For Neural Spline Flows (NSF), we set the spline bandwidth to B = 30, and used K = 11 knots. ... Each normalizing flow is trained via 50,000 Adam steps of batch size 32; we grid-search both the learning rates {10^-4, 10^-3, 10^-2} and flow layers {6, 10} ... All IRF variants use RWMH kernel, with the step size tuned to achieve a 0.8 acceptance rate using bisection search between 0.001 and 10. ... We set T = 5000 for the backward IRF and homogeneous MixFlow and ensemble IRF MixFlow, and set T = 4000 for the IRF MixFlow.
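
Illustrative sketch (not from the paper): the Experiment Setup row mentions tuning the RWMH step size by bisection between 0.001 and 10 to reach a 0.8 acceptance rate. The Python snippet below shows one way such a search could look. Everything in it is an assumption made for illustration: the 2-dimensional standard Gaussian target, the chain length, the fixed seed, the stopping tolerance, and the function names log_target, acceptance_rate, and tune_step_size. The authors' Julia implementation may differ in all of these.

```python
import numpy as np


def log_target(x):
    # Hypothetical target: standard Gaussian log-density up to a constant.
    return -0.5 * np.sum(x**2)


def acceptance_rate(step_size, n_steps=2000, dim=2, seed=0):
    """Run random-walk Metropolis-Hastings; return the fraction of accepted proposals."""
    rng = np.random.default_rng(seed)  # fixed seed keeps the estimate deterministic
    x = np.zeros(dim)
    logp = log_target(x)
    accepted = 0
    for _ in range(n_steps):
        proposal = x + step_size * rng.standard_normal(dim)
        logp_prop = log_target(proposal)
        # Symmetric proposal, so the acceptance ratio is the target density ratio.
        if np.log(rng.uniform()) < logp_prop - logp:
            x, logp = proposal, logp_prop
            accepted += 1
    return accepted / n_steps


def tune_step_size(target=0.8, lo=1e-3, hi=10.0, n_iters=30, tol=0.01):
    """Bisect on the step size, assuming acceptance falls as the step size grows."""
    for _ in range(n_iters):
        mid = 0.5 * (lo + hi)
        rate = acceptance_rate(mid)
        if abs(rate - target) < tol:
            return mid, rate
        if rate > target:
            lo = mid  # accepting too often: the step is too small
        else:
            hi = mid  # accepting too rarely: the step is too large
    mid = 0.5 * (lo + hi)
    return mid, acceptance_rate(mid)


step, rate = tune_step_size()
print(f"step size = {step:.4f}, acceptance rate = {rate:.3f}")
```

Reusing one seed for every trial step size is a design choice of this sketch: it makes the estimated acceptance rate a deterministic function of the step size, so the bisection is not chasing Monte Carlo noise between iterations.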