Projected Stein Variational Gradient Descent
Authors: Peng Chen, Omar Ghattas
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present three Bayesian inference problems with high-dimensional parameters to demonstrate the accuracy of pSVGD compared to SVGD, and the convergence and scalability of pSVGD w.r.t. the number of parameters, samples, data points, and processor cores. |
| Researcher Affiliation | Academia | Peng Chen Omar Ghattas Oden Institute for Computational Engineering and Sciences The University of Texas at Austin Austin, TX 78712. {peng, omar}@oden.utexas.edu |
| Pseudocode | Yes | Algorithm 1: pSVGD in parallel |
| Open Source Code | Yes | In the numerical experiments for structured models, we use a backtracking line search method with the Armijo-Goldstein condition to look for the step size ϵl, where the line search objective function is taken as the negative log-posterior function; see more details in the accompanying code at https://github.com/cpempire/pSVGD. |
| Open Datasets | Yes | Bayesian logistic regression for binary classification of cancer and normal patterns for mass-spectrometric data with 10,000 attributes from https://archive.ics.uci.edu/ml/datasets/Arcene |
| Dataset Splits | No | The paper states 'We use 100 data samples for training and 100 for testing.' but does not explicitly mention a separate validation split or its details. |
| Hardware Specification | Yes | The training time for pSVGD is 201 seconds compared to 477 seconds for SVGD on a MacBook Pro laptop (2019) with a 2.4 GHz 8-core Intel Core i9 processor. |
| Software Dependencies | No | The paper mentions 'MPI' for parallelization and refers to 'accompanying code' but does not specify software dependencies with version numbers (e.g., Python, PyTorch, specific libraries). |
| Experiment Setup | Yes | We use the Euler-Maruyama scheme with step size t = 0.01 for the discretization, which leads to dimension d = 100 for the discrete Brownian path x. We run SVGD and the adaptive pSVGD with line search for the learning rate, using N = 128 samples... In Section 4.2: 'with 32 samples'. In Section 4.3: 'using 256 samples and 200 iterations for different dimensions'. |
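For context on the pseudocode entry: Algorithm 1 of the paper runs the Stein variational update in parallel within a projected subspace. As a reference point, the base SVGD particle update that pSVGD projects can be sketched in NumPy. This is a minimal illustrative sketch, not the authors' implementation; the RBF kernel with median-heuristic bandwidth and the step size are assumed defaults, not settings taken from the paper.

```python
import numpy as np

def svgd_step(X, grad_logp, eps=0.1):
    """One SVGD update for particles X (shape (N, d)).

    grad_logp(X) must return the (N, d) array of gradients of the
    log target density at each particle. Uses an RBF kernel with the
    common median-heuristic bandwidth (an assumption of this sketch).
    """
    N = X.shape[0]
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)  # pairwise ||x_i - x_j||^2
    h = np.median(sq) / np.log(N + 1) + 1e-12                    # median heuristic
    K = np.exp(-sq / h)                                          # kernel matrix
    G = grad_logp(X)
    # Attraction term: kernel-weighted average of log-density gradients.
    drive = K @ G
    # Repulsion term: sum_j grad_{x_j} k(x_j, x_i), keeps particles spread out.
    repulse = (2.0 / h) * (K.sum(axis=1, keepdims=True) * X - K @ X)
    return X + eps * (drive + repulse) / N

# Toy usage: push particles toward a standard normal (grad log p = -x).
rng = np.random.default_rng(0)
X = rng.normal(3.0, 1.0, size=(50, 2))
for _ in range(1000):
    X = svgd_step(X, lambda x: -x, eps=0.3)
```

pSVGD applies this same update to low-dimensional projected coefficients rather than to the full parameter vector, which is where the reported speedup over SVGD comes from.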
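The Open Source Code row quotes the paper's use of a backtracking line search with the Armijo-Goldstein sufficient-decrease condition on the negative log-posterior. A generic sketch of such a backtracking search is below; the function name, parameter defaults, and quadratic test objective are hypothetical, and the authors' exact variant lives in the linked repository.

```python
import numpy as np

def backtracking_armijo(f, grad_f, x, direction, eps0=1.0, c=1e-4, rho=0.5, max_iter=30):
    """Backtracking line search with the Armijo sufficient-decrease condition.

    Starting from trial step eps0, shrink eps by factor rho until
        f(x + eps * direction) <= f(x) + c * eps * <grad f(x), direction>.
    Returns the accepted step size (the last trial if max_iter is hit).
    """
    fx = f(x)
    slope = float(np.dot(grad_f(x).ravel(), direction.ravel()))  # directional derivative
    eps = eps0
    for _ in range(max_iter):
        if f(x + eps * direction) <= fx + c * eps * slope:
            return eps
        eps *= rho
    return eps

# Usage on f(x) = ||x||^2 with a steepest-descent direction.
f = lambda x: float(np.dot(x, x))
grad_f = lambda x: 2.0 * x
x0 = np.array([3.0])
step = backtracking_armijo(f, grad_f, x0, -grad_f(x0))
```

In the paper's setting, `f` would be the negative log-posterior and `direction` the (projected) Stein update for each particle.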