Projected Stein Variational Gradient Descent

Authors: Peng Chen, Omar Ghattas

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present three Bayesian inference problems with high-dimensional parameters to demonstrate the accuracy of pSVGD compared to SVGD, and the convergence and scalability of pSVGD w.r.t. the number of parameters, samples, data points, and processor cores."
Researcher Affiliation | Academia | Peng Chen, Omar Ghattas. Oden Institute for Computational Engineering and Sciences, The University of Texas at Austin, Austin, TX 78712. {peng, omar}@oden.utexas.edu
Pseudocode | Yes | Algorithm 1: pSVGD in parallel
Open Source Code | Yes | "In the numerical experiments for structured models, we use a backtracking line search method with the Armijo-Goldstein condition to look for the step size ϵl, where the line search objective function is taken as the negative log-posterior function; see more details in the accompanying code at https://github.com/cpempire/pSVGD."
Open Datasets | Yes | Bayesian logistic regression for binary classification of cancer and normal patterns in mass-spectrometric data with 10,000 attributes, from https://archive.ics.uci.edu/ml/datasets/Arcene
Dataset Splits | No | The paper states "We use 100 data samples for training and 100 for testing," but does not mention a separate validation split or its details.
Hardware Specification | Yes | "The training time for pSVGD is 201 seconds compared to 477 seconds for SVGD on a MacBook Pro laptop (2019) with a 2.4 GHz 8-core Intel Core i9 processor."
Software Dependencies | No | The paper mentions MPI for parallelization and refers to "accompanying code," but does not specify software dependencies with version numbers (e.g., Python, PyTorch, specific libraries).
Experiment Setup | Yes | "We use the Euler-Maruyama scheme with step size Δt = 0.01 for the discretization, which leads to dimension d = 100 for the discrete Brownian path x. We run SVGD and the adaptive pSVGD with line search for the learning rate, using N = 128 samples." Section 4.2: "with 32 samples." Section 4.3: "using 256 samples and 200 iterations for different dimensions."
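The pseudocode entry above refers to Algorithm 1 (pSVGD in parallel), whose full listing is in the paper. As a rough orientation, the base iteration that pSVGD projects into a low-dimensional subspace is the standard SVGD particle update with an RBF kernel and median-heuristic bandwidth. The sketch below is that generic SVGD step, not the paper's projected or parallel variant; all names are illustrative.

```python
import numpy as np

def svgd_step(x, grad_logp, eps=0.1):
    """One generic SVGD update (a sketch, not the paper's pSVGD).

    x:         (N, d) array of particles
    grad_logp: callable mapping (N, d) -> (N, d), gradient of log-posterior
    eps:       step size (the paper chooses it by line search instead)
    """
    N = x.shape[0]
    diff = x[:, None, :] - x[None, :, :]        # diff[i, j] = x_i - x_j
    sq = np.sum(diff ** 2, axis=-1)             # pairwise squared distances
    h = np.median(sq) / np.log(N + 1) + 1e-12   # median bandwidth heuristic
    K = np.exp(-sq / h)                         # RBF kernel matrix
    # Repulsion term: sum_j grad_{x_j} k(x_j, x_i) = sum_j (2/h)(x_i - x_j) k(x_j, x_i)
    repulsion = (2.0 / h) * (K[:, :, None] * diff).sum(axis=1)
    phi = (K @ grad_logp(x) + repulsion) / N    # kernelized Stein direction
    return x + eps * phi
```

pSVGD applies this kind of update to coefficients in a data-informed subspace, which is what makes it scale with the quoted parameter dimensions; the projection itself is omitted here.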
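The Open Source Code row quotes a backtracking line search with an Armijo-Goldstein condition on the negative log-posterior. A minimal sketch of Armijo-style backtracking, with illustrative names and default constants not taken from the paper's code, looks like this:

```python
import numpy as np

def backtracking_step(f, grad_f, x, direction, eps0=1.0, rho=0.5, c=1e-4, max_iter=50):
    """Backtracking line search under the Armijo sufficient-decrease condition:
        f(x + eps*d) <= f(x) + c * eps * <grad f(x), d>.

    In the pSVGD experiments the objective f is reportedly the negative
    log-posterior; here f, grad_f, and the constants are placeholders.
    """
    fx = f(x)
    slope = np.dot(grad_f(x), direction)    # directional derivative at x
    eps = eps0
    for _ in range(max_iter):
        if f(x + eps * direction) <= fx + c * eps * slope:
            return eps                      # sufficient decrease achieved
        eps *= rho                          # otherwise shrink the step
    return eps
```

The design point is that the step size ϵl adapts per iteration, which is why the paper pairs "adaptive pSVGD with line search" rather than fixing a learning rate.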
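The Experiment Setup row quotes an Euler-Maruyama discretization with step size 0.01 yielding d = 100 unknowns for the discrete Brownian path. A generic Euler-Maruyama simulator illustrating that dimension count is sketched below; the drift and diffusion functions are placeholders, not the paper's model.

```python
import numpy as np

def euler_maruyama(drift, diffusion, x0, dt=0.01, n_steps=100, rng=None):
    """Simulate dX_t = drift(X_t) dt + diffusion(X_t) dW_t by Euler-Maruyama.

    With dt = 0.01 and n_steps = 100 (final time T = 1), the discrete path
    has 100 unknown increments, matching the d = 100 quoted above.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.empty(n_steps + 1)
    x[0] = x0
    for k in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dt))   # Brownian increment ~ N(0, dt)
        x[k + 1] = x[k] + drift(x[k]) * dt + diffusion(x[k]) * dW
    return x
```

With zero drift and unit diffusion this reduces to a standard Brownian path, so the endpoint variance over many paths should be close to T = 1, a quick sanity check on the discretization.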