Faster Sampling via Stochastic Gradient Proximal Sampler
Authors: Xunpeng Huang, Difan Zou, Hanze Dong, Yi-An Ma, Tong Zhang
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to compare SGLD with SPS-SGLD, where the latter is implemented by using SGLD to sample p(x|y, b) within the stochastic proximal sampler framework. Empirical results show that SPS-SGLD consistently achieves better sampling performance than vanilla SGLD across various problem dimensions. |
| Researcher Affiliation | Collaboration | Xunpeng Huang (The Hong Kong University of Science and Technology), Difan Zou (The University of Hong Kong), Hanze Dong (Salesforce AI Research), Yi-An Ma (University of California, San Diego), Tong Zhang (University of Illinois Urbana-Champaign) |
| Pseudocode | Yes | Algorithm 1 Stochastic Proximal Sampler; Algorithm 2 Inner Stochastic Gradient Langevin Dynamics: Inner SGLD(x0, b, η, δ); Algorithm 3 Inner Metropolis-adjusted Langevin algorithm: Inner MALA(x0, b, η, δ). (A hedged sketch of this outer/inner structure appears after the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | Here, the component f̃_i shares a similar definition with Zou et al. (2019), i.e., f̃_i(x) := exp(−‖x − b − µ_i‖²/2) + exp(−‖x − b + µ_i‖²/2), where the number of input data is n = 100, the dimension d ∈ {10, 20, 30, 40, 50}, the bias vector b = (3, 3, …, 3)ᵀ, and the data inputs µ_i ∼ N(µ, I_{d×d}) with µ = (√d/10)·(2, 2, …, 2)ᵀ. (A data-generation sketch follows the table.) |
| Dataset Splits | No | The paper does not describe explicit training, validation, or test dataset splits. It uses synthetic data with a known target distribution; algorithms sample from this distribution directly, and performance is evaluated with metrics such as total variation (TV) distance. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | For the common hyper-parameter settings of SGLD and SPS-SGLD, we fix the number of stochastic gradient oracles at 12000 and the mini-batch size at 1 for each iteration. We enumerate the step size of SGLD and the inner step size of SPS-SGLD from 0.2 to 1.4. Moreover, the inner-loop iteration counts and the outer-loop step sizes are grid-searched over [20, 40, 80] and [1.0, 4.0, 10.0], respectively. Table 2 lists the hyper-parameter settings selected by this grid search for the different dimension tasks. (A sketch of the search loop also follows the table.) |
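
To make the Pseudocode row concrete, here is a minimal Python sketch of the SPS-SGLD variant compared in the experiments. It assumes the standard proximal-sampler scheme: an exact Gaussian y-step followed by an approximate x-step targeting the conditional π(x|y) ∝ exp(−f(x) − ‖x − y‖²/(2η)). The function names, signatures, and the `stoch_grad_f` oracle are illustrative assumptions, not the authors' code.

```python
import numpy as np

def inner_sgld(x0, y, eta, delta, n_steps, stoch_grad_f, rng):
    """Approximate a sample from pi(x | y) ~ exp(-f(x) - ||x - y||^2 / (2*eta))
    by running SGLD with a mini-batch-1 stochastic gradient of f."""
    x = x0.copy()
    for _ in range(n_steps):
        # Gradient of the conditional potential: grad f(x) + (x - y) / eta.
        g = stoch_grad_f(x, rng) + (x - y) / eta
        x = x - delta * g + np.sqrt(2.0 * delta) * rng.standard_normal(x.shape)
    return x

def stochastic_proximal_sampler(x0, eta, delta, n_outer, n_inner, stoch_grad_f, rng):
    """Outer loop: alternate an exact Gaussian y-step with an approximate
    x-step (restricted Gaussian oracle) implemented by the inner SGLD."""
    x = x0.copy()
    for _ in range(n_outer):
        y = x + np.sqrt(eta) * rng.standard_normal(x.shape)  # y ~ N(x, eta * I)
        x = inner_sgld(x, y, eta, delta, n_inner, stoch_grad_f, rng)
    return x
```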
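
Similarly, the synthetic target in the Open Datasets row can be generated as below. This sketch assumes the finite-sum potential f(x) = −(1/n) Σ_i log f̃_i(x) and the √d/10 scaling on µ; both are readings of a garbled extraction, and the paper may define the target slightly differently.

```python
import numpy as np

def make_mixture_potential(n=100, d=10, seed=0):
    """Build a mini-batch-1 stochastic gradient oracle for the synthetic
    Gaussian-mixture components
    f~_i(x) = exp(-||x - b - mu_i||^2 / 2) + exp(-||x - b + mu_i||^2 / 2)."""
    rng = np.random.default_rng(seed)
    b = 3.0 * np.ones(d)                              # bias vector b = (3, ..., 3)^T
    mu = (np.sqrt(d) / 10.0) * 2.0 * np.ones(d)       # assumed scaling on mu
    mus = rng.normal(loc=mu, scale=1.0, size=(n, d))  # mu_i ~ N(mu, I_{dxd})

    def stoch_grad_f(x, rng):
        # Gradient of -log f~_i(x) for a uniformly sampled component i;
        # unbiased for grad f under the assumed finite-sum potential.
        mu_i = mus[rng.integers(n)]
        u, v = x - b - mu_i, x - b + mu_i
        wu = np.exp(-u @ u / 2.0)
        wv = np.exp(-v @ v / 2.0)
        return (wu * u + wv * v) / (wu + wv)

    return stoch_grad_f
```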
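
Finally, the Experiment Setup row translates into the following grid search under a fixed oracle budget. With batch size 1, each inner step consumes one stochastic-gradient oracle, so the outer-loop count is set to keep the total budget at 12000. The 0.2 step-size spacing is an assumption; the paper states only the range 0.2 to 1.4.

```python
import itertools
import numpy as np

TOTAL_ORACLES = 12000  # fixed stochastic-gradient budget (Experiment Setup row)
BATCH_SIZE = 1         # mini-batch size per iteration

inner_step_sizes = np.round(np.arange(0.2, 1.4 + 1e-9, 0.2), 1)  # assumed spacing
inner_iterations = [20, 40, 80]                                  # inner-loop grid
outer_step_sizes = [1.0, 4.0, 10.0]                              # outer-loop grid

for delta, K, eta in itertools.product(inner_step_sizes, inner_iterations, outer_step_sizes):
    n_outer = TOTAL_ORACLES // (K * BATCH_SIZE)  # equalize the oracle budget
    # Run stochastic_proximal_sampler(x0, eta, delta, n_outer, K, stoch_grad_f, rng)
    # and record the estimated TV distance for this (delta, K, eta) configuration.
```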