Faster Sampling via Stochastic Gradient Proximal Sampler

Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yi-An Ma, Tong Zhang

ICML 2024

Each entry below gives a reproducibility variable, the assessed result, and the supporting LLM response.
Research Type: Experimental
LLM Response: We conduct experiments to compare SGLD with SPS-SGLD, where the latter is implemented by using SGLD to sample from p(x | y, b) within the stochastic proximal sampler framework. Empirical results show that SPS-SGLD consistently achieves better sampling performance than vanilla SGLD across problem dimensions. (A sketch of this outer/inner structure follows this entry.)
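For orientation, the sketch below shows this outer/inner structure in NumPy: a Gaussian forward step y ~ N(x, ηI), followed by inner SGLD targeting p(x | y) ∝ exp(-f(x) - ||x - y||²/(2η)) with a mini-batch stochastic gradient of f. This is a minimal sketch, not the authors' code; the names grad_f_batch and sample_batch are hypothetical stand-ins.

```python
import numpy as np

def inner_sgld(grad_f_batch, x0, y, eta, delta, n_inner, sample_batch):
    """Inner SGLD sketch: approximately sample from
    p(x | y) ~ exp(-f(x) - ||x - y||^2 / (2 * eta))
    using a mini-batch stochastic gradient of f."""
    x = x0.copy()
    for _ in range(n_inner):
        # stochastic gradient of f, plus the exact gradient of the quadratic term
        g = grad_f_batch(x, sample_batch()) + (x - y) / eta
        x = x - delta * g + np.sqrt(2.0 * delta) * np.random.randn(*x.shape)
    return x

def stochastic_proximal_sampler(grad_f_batch, x0, eta, delta,
                                n_outer, n_inner, sample_batch):
    """Outer loop sketch: a Gaussian forward step y ~ N(x, eta * I),
    then an approximate backward step implemented by inner SGLD."""
    x = x0.copy()
    for _ in range(n_outer):
        y = x + np.sqrt(eta) * np.random.randn(*x.shape)  # forward step
        x = inner_sgld(grad_f_batch, x, y, eta, delta, n_inner, sample_batch)
    return x
```

With the mixture target defined under the Open Datasets entry below, a call might look like `stochastic_proximal_sampler(grad_f_batch, np.zeros(d), eta=4.0, delta=0.5, n_outer=150, n_inner=80, sample_batch=sample_batch)` (parameter values illustrative only).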
Researcher Affiliation: Collaboration
LLM Response: Xunpeng Huang (1), Difan Zou (2), Hanze Dong (3), Yi-An Ma (4), Tong Zhang (5). Affiliations: (1) The Hong Kong University of Science and Technology; (2) The University of Hong Kong; (3) Salesforce AI Research; (4) University of California, San Diego; (5) University of Illinois Urbana-Champaign.
Pseudocode: Yes
LLM Response: Algorithm 1, Stochastic Proximal Sampler; Algorithm 2, Inner Stochastic Gradient Langevin Dynamics: InnerSGLD(x0, b, η, δ); Algorithm 3, Inner Metropolis-adjusted Langevin Algorithm: InnerMALA(x0, b, η, δ). (A generic sketch of the Metropolis-adjusted inner step follows this entry.)
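For reference, here is a generic Metropolis-adjusted inner step targeting the same conditional p(x | y). This is a standard MALA sketch using the full potential, not necessarily identical to the paper's Algorithm 3 (which works with mini-batched potentials); f and grad_f are assumed callables.

```python
import numpy as np

def inner_mala(f, grad_f, x0, y, eta, delta, n_steps):
    """Standard MALA sketch targeting p(x | y) ~ exp(-U(x)) with
    U(x) = f(x) + ||x - y||^2 / (2 * eta)."""
    U = lambda z: f(z) + np.sum((z - y) ** 2) / (2.0 * eta)
    gU = lambda z: grad_f(z) + (z - y) / eta
    x = x0.copy()
    for _ in range(n_steps):
        mean_fwd = x - delta * gU(x)
        prop = mean_fwd + np.sqrt(2.0 * delta) * np.random.randn(*x.shape)
        # log Metropolis-Hastings ratio with Gaussian proposal densities
        log_q_fwd = -np.sum((prop - mean_fwd) ** 2) / (4.0 * delta)
        mean_back = prop - delta * gU(prop)
        log_q_back = -np.sum((x - mean_back) ** 2) / (4.0 * delta)
        log_alpha = U(x) - U(prop) + log_q_back - log_q_fwd
        if np.log(np.random.rand()) < log_alpha:
            x = prop  # accept; otherwise stay at x
    return x
```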
Open Source Code: No
LLM Response: The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets: Yes
LLM Response: Here, we consider the component $\tilde{f}_i$ to share a similar definition with Zou et al. (2019), i.e., $\tilde{f}_i(x) := e^{-\|x - b - \mu_i\|^2/2} + e^{-\|x - b + \mu_i\|^2/2}$, where the number of input data is $n = 100$, the dimension $d \in \{10, 20, 30, 40, 50\}$, the bias vector $b = (3, 3, \ldots, 3)^\top / \sqrt{d/10}$, and the data input $\mu_i \sim N(\mu, I_{d \times d})$ with $\mu = (2, 2, \ldots, 2)^\top$. (A sketch of this target and its stochastic gradient follows this entry.)
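Below is a sketch of this target and a mini-batch stochastic gradient. Two points are assumptions from the garbled source text: the $1/\sqrt{d/10}$ scaling is attached to the bias vector b, and the target density is taken proportional to $\sum_i \tilde{f}_i(x)$ (potential $f(x) = -\log \sum_i \tilde{f}_i(x)$) as in Zou et al. (2019). All names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 10, 100
b = np.full(d, 3.0) / np.sqrt(d / 10)             # bias vector (scaling assumed)
mu = np.full(d, 2.0)
mus = rng.normal(loc=mu, scale=1.0, size=(n, d))  # data inputs mu_i ~ N(mu, I_d)

def grad_f_batch(x, idx):
    """Mini-batch stochastic gradient of f(x) = -log sum_i f~_i(x), assuming
    the target density is proportional to sum_i f~_i (an assumption)."""
    num = np.zeros_like(x)
    den = 0.0
    for i in idx:
        e1 = np.exp(-np.sum((x - b - mus[i]) ** 2) / 2.0)
        e2 = np.exp(-np.sum((x - b + mus[i]) ** 2) / 2.0)
        num += (x - b - mus[i]) * e1 + (x - b + mus[i]) * e2
        den += e1 + e2
    return num / max(den, 1e-300)  # guard against underflow far from the modes

def sample_batch(batch_size=1):
    """Uniformly subsample component indices (mini-batch size 1 in the paper)."""
    return rng.integers(0, n, size=batch_size)
```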
Dataset Splits: No
LLM Response: The paper does not describe explicit training, validation, or test dataset splits. It uses synthetic data where the target distribution is known, and algorithms sample from this distribution directly; sampling quality is evaluated with metrics such as the total variation (TV) distance. (One common way to estimate TV from samples is sketched after this entry.)
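The paper's exact TV evaluation protocol is not reproduced here; one common way to estimate TV between a sample set and reference samples from the known target, e.g., along a one-dimensional projection, is a shared histogram. A minimal sketch:

```python
import numpy as np

def tv_histogram(samples_p, samples_q, n_bins=50):
    """Empirical TV distance between two 1-D sample sets via a shared
    histogram; a common estimator, assumed rather than taken from the paper."""
    lo = min(samples_p.min(), samples_q.min())
    hi = max(samples_p.max(), samples_q.max())
    hp, _ = np.histogram(samples_p, bins=n_bins, range=(lo, hi))
    hq, _ = np.histogram(samples_q, bins=n_bins, range=(lo, hi))
    hp = hp / hp.sum()  # normalize counts to probabilities
    hq = hq / hq.sum()
    return 0.5 * np.abs(hp - hq).sum()
```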
Hardware Specification: No
LLM Response: The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies: No
LLM Response: The paper does not list ancillary software details, such as library or solver names with version numbers, needed to replicate the experiments.
Experiment Setup: Yes
LLM Response: For the common hyper-parameter settings of SGLD and SPS-SGLD, we fix the number of stochastic gradient oracles at 12000 and the mini-batch size at 1 for each iteration. We enumerate the step size of SGLD and the inner step size of SPS-SGLD from 0.2 to 1.4. The inner-loop iteration counts and the outer-loop step sizes are grid-searched over [20, 40, 80] and [1.0, 4.0, 10.0], respectively. (See Table 2 of the paper, "Hyper-parameter settings for different dimension tasks based on the grid search.") A sketch of such a grid search under the fixed oracle budget follows this entry.
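The sketch below shows such a grid search under the fixed budget of 12000 gradient oracles. The spacing of the 0.2-1.4 step-size grid is illustrative (the paper does not state it here), and run_sps_sgld is a hypothetical helper that runs the sampler with the given hyper-parameters and returns a TV estimate.

```python
import itertools

TOTAL_ORACLES = 12000  # fixed budget of stochastic gradient calls (from the paper)
BATCH_SIZE = 1

inner_step_grid = [0.2, 0.5, 0.8, 1.1, 1.4]  # "from 0.2 to 1.4"; spacing illustrative
inner_iter_grid = [20, 40, 80]
outer_step_grid = [1.0, 4.0, 10.0]

best = None
for delta, n_inner, eta in itertools.product(inner_step_grid,
                                             inner_iter_grid,
                                             outer_step_grid):
    # choose the number of outer loops so the total oracle budget stays fixed
    n_outer = TOTAL_ORACLES // (n_inner * BATCH_SIZE)
    tv = run_sps_sgld(eta=eta, delta=delta,  # hypothetical evaluation helper
                      n_outer=n_outer, n_inner=n_inner)
    if best is None or tv < best[0]:
        best = (tv, {"delta": delta, "n_inner": n_inner, "eta": eta})

print(best)
```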