Greedy inference with structure-exploiting lazy maps

Authors: Michael Brennan, Daniele Bigoni, Olivier Zahm, Alessio Spantini, Youssef Marzouk

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Section 4, "Numerical examples": We present numerical demonstrations of the lazy framework as follows. We first illustrate Algorithm 2 on a 2-dimensional toy example, where we show the progressive Gaussianization of the posterior using a sequence of 1-dimensional lazy maps. We then demonstrate the benefits of the lazy framework (Algorithms 1 and 2) in several challenging inference problems. We consider Bayesian logistic regression and a Bayesian neural network, and compare the performance of a baseline transport map to lazy maps using the same underlying transport class. We measure performance improvements in four ways: (1) the final ELBO achieved by the transport maps after training; (2, 3) the final trace diagnostics (1/2) Tr(H^B_ℓ) and (1/2) Tr(H_ℓ), which bound the error D_KL(π ‖ (T_ℓ)♯ρ); and (4) the variance diagnostic (1/2) V_ρ[log(ρ / T_ℓ^♯π)], which is an asymptotic approximation of D_KL((T_ℓ)♯ρ ‖ π) as (T_ℓ)♯ρ → π (see [40]). Finally, we highlight the advantages of greedily training lazy maps in a nonlinear problem defined by a high-dimensional elliptic partial differential equation (PDE), often used for testing high-dimensional inference methods [4, 16, 53].
Researcher Affiliation | Academia | Michael C. Brennan (Massachusetts Institute of Technology, Cambridge, MA 02139, USA; mcbrenn@mit.edu); Daniele Bigoni (Massachusetts Institute of Technology, Cambridge, MA 02139, USA; dabi@mit.edu); Olivier Zahm (Université Grenoble Alpes, INRIA, CNRS, LJK, 38000 Grenoble, France; olivier.zahm@inria.fr); Alessio Spantini (Massachusetts Institute of Technology, Cambridge, MA 02139, USA; alessio.spantini@gmail.com); Youssef Marzouk (Massachusetts Institute of Technology, Cambridge, MA 02139, USA; ymarz@mit.edu)
Pseudocode | Yes | Algorithm 1: Construction of a lazy map. [...] Algorithm 2: Construction of a deeply lazy map.
Open Source Code | Yes | Code for the numerical examples can be found at https://github.com/MichaelCBrennan/lazymaps and http://bit.ly/2QlelXF.
Open Datasets | Yes | We consider a high-dimensional Bayesian logistic regression problem using the UCI Parkinson's disease classification data [1], studied in [49]. [...] UCI yacht hydrodynamics data set [2]. [...] Data for Sections 4.4, G.4, and G.5 can be downloaded at http://bit.ly/2X09Ns8, http://bit.ly/2HytQc0, and http://bit.ly/2Eug5ZR.
Dataset Splits | No | The paper mentions using specific datasets but does not provide explicit details on training, validation, or test splits (e.g., percentages or sample counts).
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for its experiments; it mentions only the software frameworks used.
Software Dependencies | No | The paper mentions software such as the Transport Maps framework [7], the TensorFlow Probability library [19], FEniCS [37], and dolfin-adjoint [22], but does not provide specific version numbers for these dependencies.
Experiment Setup | Yes | We consider degree 3 polynomial maps as the underlying transport class. We use Gauss quadrature rules of order 10 for the discretization of the KL divergence and the approximation of H^B_ℓ (m = 121 in Algorithms 3 and 5). [...] We choose a relatively uninformative prior of N(0, 10² I_d). [...] In G3-IAF, each layer has rank r = 200. [...] Our inference problem is 581-dimensional, given a network input dimension of 6, one hidden layer of dimension 20, and an output layer of dimension 1. We use sigmoid activations in the input and hidden layers, and a linear output layer. Model parameters are endowed with independent Gaussian priors with zero mean and variance 100. [...] Expectations appearing in the algorithm are discretized with m = 500 Monte Carlo samples. To avoid wasting work in the early iterations, we use affine maps of rank r = 4 for iterations ℓ = 1, ..., 5. We then switch to polynomial maps of degree 2 and rank r = 2 for the remaining iterations.
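The variance diagnostic quoted above, (1/2) V_ρ[log(ρ / T^♯π)], can be illustrated concretely. Below is a minimal Monte Carlo sketch (not the authors' code; the toy 2-D Gaussian target, the linear candidate map, and all function names are illustrative) showing that the diagnostic vanishes when the map is exact, i.e. T♯ρ = π, and is strictly positive otherwise:

```python
import numpy as np

# Hedged sketch of the variance diagnostic (1/2) Var_rho[log rho - log T#pi]
# for reference rho = N(0, I) and a toy Gaussian target pi = N(0, Sigma).
# For a linear map T(x) = L x, the pullback density is
#   T#pi(x) = pi(L x) * |det L|.

rng = np.random.default_rng(0)
d = 2
Sigma = np.array([[2.0, 0.5], [0.5, 1.0]])  # illustrative target covariance
L = np.linalg.cholesky(Sigma)               # exact map: T#rho = pi

def log_rho(x):
    # standard Gaussian log-density, row-wise
    return -0.5 * np.sum(x**2, axis=1) - 0.5 * d * np.log(2 * np.pi)

def log_pi(y):
    P = np.linalg.inv(Sigma)
    quad = np.einsum('ni,ij,nj->n', y, P, y)
    norm = d * np.log(2 * np.pi) + np.log(np.linalg.det(Sigma))
    return -0.5 * quad - 0.5 * norm

def variance_diagnostic(T_matrix, m=50_000):
    x = rng.standard_normal((m, d))         # samples from rho
    y = x @ T_matrix.T                      # T(x)
    log_pullback = log_pi(y) + np.log(abs(np.linalg.det(T_matrix)))
    return 0.5 * np.var(log_rho(x) - log_pullback)

print(variance_diagnostic(L))          # exact map: diagnostic is ~0
print(variance_diagnostic(np.eye(d)))  # identity map: strictly positive
```

With the exact Cholesky map the integrand log ρ − log T^♯π is identically zero, so the diagnostic is zero up to floating-point error; any mismatch between T♯ρ and π makes it positive.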
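The quoted setup's "Gauss quadrature rules of order 10 ... m = 121" is consistent with an 11-point tensorized Gauss-Hermite rule in two dimensions (11² = 121 nodes). A hedged sketch of that discretization of expectations under the standard Gaussian reference, assuming this reading of the quadrature order:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Hedged sketch: tensor-product Gauss-Hermite quadrature approximating
# E_rho[f] for rho = N(0, I_2), using 11 nodes per dimension (121 total).

n = 11
x1, w1 = hermegauss(n)        # probabilists' Hermite: weight exp(-x^2/2)
w1 = w1 / np.sqrt(2 * np.pi)  # normalize so the weights sum to 1

# 2-D tensor-product grid and weights
X1, X2 = np.meshgrid(x1, x1, indexing='ij')
W = np.outer(w1, w1).ravel()
pts = np.column_stack([X1.ravel(), X2.ravel()])  # 121 quadrature points

def expect(f):
    # quadrature approximation of E_rho[f]
    return np.sum(W * f(pts))

# sanity checks against known standard-Gaussian moments
print(expect(lambda p: p[:, 0]**2))  # E[x^2] = 1
print(expect(lambda p: p[:, 0]**4))  # E[x^4] = 3
```

An n-point Gauss rule integrates polynomials up to degree 2n − 1 exactly, so an 11-point rule handles the polynomial integrands arising from degree-3 maps comfortably; the moment checks above are exact up to floating-point error.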