Stochastic Gradient Hamiltonian Monte Carlo Methods with Recursive Variance Reduction

Authors: Difan Zou, Pan Xu, Quanquan Gu

NeurIPS 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Thorough experiments on synthetic and real-world datasets validate our theory and demonstrate the superiority of SRVR-HMC." |
| Researcher Affiliation | Academia | Difan Zou, Pan Xu, and Quanquan Gu; Department of Computer Science, University of California, Los Angeles, Los Angeles, CA 90095 (knowzou@cs.ucla.edu, panxu@cs.ucla.edu, qgu@cs.ucla.edu) |
| Pseudocode | Yes | Algorithm 1: Stochastic Recursive Variance-Reduced gradient HMC (SRVR-HMC); see the hedged sketch after this table. |
| Open Source Code | No | The paper neither states that the source code for the described methodology is released nor provides a link to a code repository. |
| Open Datasets | Yes | "We compare the performance of SRVR-HMC with all the baseline algorithms on the MEG dataset, which consists of 17730 time-points in 122 channels." (http://research.ics.aalto.fi/ica/eegmeg/MEG_data.html) |
| Dataset Splits | No | The paper states: "we extract two subsets with sizes n = 500 and n = 5000 from the original dataset for training, and regard the rest 12730 examples as test dataset." This specifies training and test sets but no separate validation split. |
| Hardware Specification | No | The paper does not report hardware details such as GPU or CPU models, memory, or cloud computing instances used to run the experiments. |
| Software Dependencies | No | The paper does not list software dependencies, libraries, or solvers with the version numbers needed to replicate the experiments. |
| Experiment Setup | Yes | For the synthetic visualization experiment: "we run SRVR-HMC for 10^4 data passes, and use the last 10^5 iterates to visualize the estimated distribution, where the batch size, minibatch size and epoch length are set to be B0 = n, B = 1 and L = n respectively." For the real-data experiments, the batch size, minibatch size, and epoch length are set to B0 = n/5, B = 10, and L = B0/B, and the remaining hyperparameters are tuned to achieve the best performance. |
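Since Algorithm 1 is the paper's central pseudocode artifact and no code is released, a minimal sketch may help orient a reimplementation. The snippet below is an illustrative Python rendering of SRVR-HMC: a recursive (SARAH/SPIDER-style) variance-reduced gradient estimator with large-batch anchor gradients every L iterations, plugged into an underdamped Langevin (HMC-type) update. It is a sketch under stated assumptions, not the authors' implementation: it uses a plain Euler–Maruyama discretization, whereas the paper's Algorithm 1 integrates the Ornstein–Uhlenbeck part of the dynamics exactly, and the names `grad_fi`, `eta`, and `gamma` are illustrative choices.

```python
import numpy as np

def srvr_hmc(grad_fi, n, x0, eta=1e-3, gamma=1.0, B0=None, B=10, L=None,
             n_iters=10_000, rng=None):
    """Hedged sketch of SRVR-HMC (Zou, Xu & Gu, NeurIPS 2019).

    grad_fi(x, idx) is assumed to return the average gradient of the
    component functions f_i, i in idx, at x. All defaults here are
    illustrative, not the paper's tuned values.
    """
    rng = np.random.default_rng() if rng is None else rng
    B0 = n if B0 is None else B0          # anchor batch size
    L = max(B0 // B, 1) if L is None else L  # epoch length

    d = x0.shape[0]
    x, x_prev = x0.copy(), x0.copy()
    v = np.zeros(d)                       # momentum / velocity variable
    g = np.zeros(d)                       # variance-reduced gradient estimate
    samples = []

    for k in range(n_iters):
        if k % L == 0:
            # Epoch start: large-batch anchor gradient (B0 = n gives the full gradient).
            idx = rng.choice(n, size=B0, replace=False)
            g = grad_fi(x, idx)
        else:
            # Recursive variance-reduced update:
            # g_k = (1/B) sum_{i in I} (grad f_i(x_k) - grad f_i(x_{k-1})) + g_{k-1}
            idx = rng.choice(n, size=B, replace=True)
            g = g + grad_fi(x, idx) - grad_fi(x_prev, idx)

        x_prev = x.copy()
        # Euler-Maruyama step of underdamped Langevin dynamics; the paper
        # instead integrates the Ornstein-Uhlenbeck part exactly.
        v = v - eta * (gamma * v + g) + np.sqrt(2 * gamma * eta) * rng.standard_normal(d)
        x = x + eta * v
        samples.append(x.copy())

    return np.asarray(samples)
```

Under the settings quoted in the table, B0 = n, B = 1, L = n corresponds to the near-full-gradient regime used for the density visualization, while B0 = n/5, B = 10, L = B0/B matches the real-data configuration.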