Deterministic Langevin Monte Carlo with Normalizing Flows for Bayesian Inference
Authors: Richard Grumitt, Biwei Dai, Uroš Seljak
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show on various examples that the method is competitive against state of the art sampling methods. |
| Researcher Affiliation | Academia | Richard D.P. Grumitt, Department of Astronomy, Tsinghua University, Beijing 100084, China; Biwei Dai, Physics Department, University of California Berkeley, CA 94720, USA; Uroš Seljak, Physics Department, University of California and Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA |
| Pseudocode | Yes | Algorithm 1: Deterministic Langevin Monte Carlo with Normalizing Flows (a sketch of this update loop follows the table). |
| Open Source Code | Yes | Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [Yes] Did you include any new assets either in the supplemental material or as a URL? [Yes] Code included as supplemental material. |
| Open Datasets | Yes | Hierarchical logistic regression with a sparse prior applied to the German credit dataset is a popular benchmark for sampling methods [Dua and Graff, 2017]. The German credit data are taken from the public UCI repository. |
| Dataset Splits | Yes | The number of layers L can be chosen based on cross-validation, where we set aside 20% of the samples and iterate until the validation data start to diverge from the training data (see the layer-selection sketch after the table). |
| Hardware Specification | No | Typical SINF training time is of order seconds on a CPU. We assume a likelihood gradient cost of 1 minute, and the cost of the NF itself (seconds) is negligible. (No specific CPU models, GPUs, or detailed configurations are provided for running experiments.) |
| Software Dependencies | No | The main baseline we compare against is the No-U-Turn Sampler (NUTS) [Hoffman et al., 2014], an adaptive HMC variant implemented in the NumPyro library [Phan et al., 2019]. For NF we use SINF, which has very few hyper-parameters [Dai and Seljak, 2021], is fast, and iterative. (No specific software versions are provided.) |
| Experiment Setup | Yes | The number of layers L can be chosen based on cross-validation, where we set aside 20% of the samples and iterate until the validation data start to diverge from the training data. However, for the d = 1000 Gaussian (Section 5.5) we fix L = 5. At each iteration, we take Adagrad updates in the ∇(U(x(t)) − V(x(t))) direction [Duchi et al., 2011]; the first sketch below illustrates this update. We use learning rates between 0.001 and 0.1, with smaller learning rates being more robust for targets with complicated geometries such as funnel distributions. Where we include NUTS as a baseline, we use 500 tuning steps. |
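
For concreteness, here is a minimal sketch of the particle update loop that Algorithm 1 and the Experiment Setup row describe: fit a density estimate q_t to the current particles, then move each particle by an Adagrad step [Duchi et al., 2011] along the gradient of U − V, where V = −ln q_t. This is not the authors' code: a full-covariance Gaussian fit stands in for the SINF normalizing flow, and the standard-normal target together with every helper name (`grad_U`, `fit_flow`, `grad_V`, `run_dlmc`) are illustrative assumptions.

```python
# Minimal sketch of the DLMC particle update (after Algorithm 1).
# NOT the authors' implementation: a full-covariance Gaussian fit stands in
# for the SINF normalizing flow, and the target, helper names, and settings
# below are illustrative assumptions.
import numpy as np

def grad_U(x):
    # Target potential U(x) = 0.5 ||x||^2, i.e. a standard-normal target;
    # any differentiable potential could be substituted.
    return x

def fit_flow(particles, jitter=1e-6):
    # Stand-in for fitting the flow q_t to the particles (the paper uses SINF):
    # here, a Gaussian with the sample mean and covariance.
    mu = particles.mean(axis=0)
    cov = np.cov(particles, rowvar=False) + jitter * np.eye(particles.shape[1])
    return mu, np.linalg.inv(cov)

def grad_V(x, mu, prec):
    # V(x) = -ln q_t(x); for the Gaussian stand-in, grad V = prec @ (x - mu).
    return (x - mu) @ prec

def run_dlmc(x, n_iters=1000, lr=0.1, eps=1e-8):
    # Move particles along -grad(U - V) with Adagrad-scaled steps
    # [Duchi et al., 2011]; lr = 0.1 sits at the top of the 0.001-0.1
    # range quoted above. The flow is refit at every iteration.
    accum = np.zeros_like(x)          # Adagrad: running sum of squared grads
    for _ in range(n_iters):
        mu, prec = fit_flow(x)
        g = grad_U(x) - grad_V(x, mu, prec)   # grad of (U - V) per particle
        accum += g**2
        x = x - lr * g / (np.sqrt(accum) + eps)
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x0 = rng.normal(loc=3.0, scale=0.5, size=(512, 2))  # start off-target
    xs = run_dlmc(x0)
    print("mean (target 0):", xs.mean(axis=0))
    print("var  (target 1):", xs.var(axis=0))
```

In this sketch the update stalls exactly when the fitted density matches the target (U = V up to a constant), mirroring the fixed point of the deterministic Langevin flow.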
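
The layer-selection rule quoted under Dataset Splits (hold out 20% of the samples, grow the model stage by stage, stop once the held-out fit diverges) can also be sketched generically. Again this is an illustrative assumption rather than the paper's code: a scikit-learn Gaussian mixture, with components playing the role of SINF layers, keeps the example self-contained, and `select_depth` is a hypothetical helper name.

```python
# Hedged sketch of the quoted layer-selection rule: hold out 20% of the
# samples, grow the model one stage at a time, and stop when the held-out
# log-likelihood starts to diverge from the training fit. A GaussianMixture
# (components = stages) stands in for adding SINF layers.
import numpy as np
from sklearn.mixture import GaussianMixture

def select_depth(samples, max_stages=10, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_val = len(samples) // 5                      # 20% validation split
    val, train = samples[idx[:n_val]], samples[idx[n_val:]]

    best_stages, best_val = 1, -np.inf
    for stages in range(1, max_stages + 1):
        model = GaussianMixture(n_components=stages, random_state=seed).fit(train)
        val_ll = model.score(val)                  # mean held-out log-likelihood
        if val_ll < best_val:                      # validation diverging: stop
            break
        best_stages, best_val = stages, val_ll
    return best_stages

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Two-mode toy data: extra stages should help at first, then stop paying off.
    data = np.concatenate([rng.normal(-3, 1, (500, 2)), rng.normal(3, 1, (500, 2))])
    print("selected stages:", select_depth(data))
```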