Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling

Authors: Xiaocheng Shang, Zhanxing Zhu, Benedict Leimkuhler, Amos J. Storkey

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Various numerical experiments are performed in Section 4 to verify the usefulness of CCAdL in a wide range of large-scale machine learning applications.
Researcher Affiliation | Academia | Xiaocheng Shang (University of Edinburgh, x.shang@ed.ac.uk); Zhanxing Zhu (University of Edinburgh, zhanxing.zhu@ed.ac.uk); Benedict Leimkuhler (University of Edinburgh, b.leimkuhler@ed.ac.uk); Amos J. Storkey (University of Edinburgh, a.storkey@ed.ac.uk)
Pseudocode | Yes | Algorithm 1: Covariance-Controlled Adaptive Langevin (CCAdL). (A hedged sketch of one CCAdL-style update step is given after the table.)
Open Source Code | No | The paper does not contain an explicit statement about releasing its source code or provide any links to a code repository.
Open Datasets | Yes | We then consider a Bayesian logistic regression model trained on the benchmark MNIST dataset for binary classification of digits 7 and 9, using 12,214 training data points and a test set of size 2037. ... We trained a DRBM on different large-scale multi-class datasets from the LIBSVM dataset collection (http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html), including connect-4, letter, and SensIT Vehicle (acoustic). The detailed information on these datasets is presented in Table 2. (A data-preparation sketch for the MNIST 7-vs-9 task follows the table.)
Dataset Splits | No | The paper specifies training and test set sizes (e.g., 12,214 training data points and a test set of size 2037 for MNIST, and the training/test set columns in Table 2 for the DRBM datasets), but it does not mention a validation set or describe how data were split for validation.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments.
Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers, that would be necessary to replicate the experiments.
Experiment Setup | Yes | We apply the same experimental setting as in [5]. We generated N = 100 samples from the standard normal distribution N(0, 1). We used the likelihood function N(x_i | µ, γ⁻¹) and assigned a Normal-Gamma distribution as their prior, i.e., µ, γ ∼ N(µ | 0, γ)Gam(γ | 1, 1). ... A subset of size n = 500 was used at each timestep. ... The size of the subset was chosen as 500–1000 to obtain a reasonable variance estimation. For each dataset, we chose the first 20% of the total number of passes over the entire dataset as the burn-in period, and collected the remaining samples for prediction. (A sketch of the synthetic Gaussian setup follows the table.)
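
Notes on the technical rows above. The paper's Algorithm 1 is not reproduced in this summary; as a rough orientation, the sketch below shows what one update step of a CCAdL-style sampler could look like in NumPy: stochastic-gradient dynamics in which the momentum is damped both by a Nosé-Hoover thermostat variable ξ and by an extra friction term proportional to an estimate of the gradient-noise covariance. The grad_minibatch interface, the diagonal covariance estimate, the running average, and the constants (k_B T = 1, unit thermostat mass, baseline friction A) are illustrative assumptions, not the authors' exact Algorithm 1.

```python
import numpy as np

def ccadl_step(theta, p, xi, grad_minibatch, h=1e-3, A=1.0, cov_diag=None, rng=None):
    """One update of a CCAdL-style sampler (hedged sketch, not the paper's Algorithm 1).

    theta, p : parameter and momentum vectors (1-D NumPy arrays)
    xi       : scalar Nose-Hoover thermostat variable
    grad_minibatch(theta) -> (grad, cov_diag): noisy gradient of the negative
        log-posterior and a diagonal estimate of its covariance (assumed interface)
    h        : step size; A : baseline friction/noise amplitude (assumed values)
    """
    rng = np.random.default_rng() if rng is None else rng
    d = theta.size

    g, sigma_diag = grad_minibatch(theta)        # noisy force and noise-covariance estimate
    if cov_diag is not None:                     # optional running average of the covariance
        sigma_diag = 0.9 * cov_diag + 0.1 * sigma_diag

    # Momentum update: noisy force, covariance-controlled friction (h/2) * Sigma * p,
    # adaptive thermostat friction xi * p, and additive noise of strength sqrt(2*A*h).
    p = (p - h * g
         - h * (0.5 * h * sigma_diag + xi) * p
         + np.sqrt(2.0 * A * h) * rng.standard_normal(d))

    # Position update and Nose-Hoover thermostat update (target k_B T = 1, unit mass).
    theta = theta + h * p
    xi = xi + h * (p @ p / d - 1.0)

    return theta, p, xi, sigma_diag
```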
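
For the Bayesian logistic regression row, the paper names the data (MNIST digits 7 and 9; 12,214 training and 2037 test points) but not the tooling used to load it. A minimal way to rebuild that subset, assuming scikit-learn's OpenML mirror of MNIST and the standard 60,000/10,000 train/test ordering, is sketched below.

```python
import numpy as np
from sklearn.datasets import fetch_openml

# Fetch MNIST (70,000 x 784) from OpenML; the tooling is an assumption, not from the paper.
X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
y = y.astype(int)

# Standard MNIST ordering: the first 60,000 images are the training set, the last 10,000 the test set.
X_train, y_train, X_test, y_test = X[:60000], y[:60000], X[60000:], y[60000:]

def seven_vs_nine(X, y):
    """Keep only digits 7 and 9 and relabel them as a binary task (7 -> 0, 9 -> 1)."""
    mask = (y == 7) | (y == 9)
    return X[mask] / 255.0, (y[mask] == 9).astype(int)   # pixel scaling to [0, 1] is a common choice, not from the paper

X_tr, y_tr = seven_vs_nine(X_train, y_train)   # ~12,214 examples
X_te, y_te = seven_vs_nine(X_test, y_test)     # ~2,037 examples
print(X_tr.shape, X_te.shape)
```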
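
Finally, the quoted synthetic Gaussian setup can be made concrete. The sketch below generates the N = 100 standard-normal observations, forms the subsampled gradient of the negative log-posterior for (µ, γ), and estimates the diagonal of its covariance, i.e., the quantity that the quoted subset sizes (n = 500, or 500–1000 on the larger datasets) are meant to estimate reliably; the returned pair matches the grad_minibatch interface assumed in the first sketch. The conjugate parameterization µ | γ ∼ N(0, γ⁻¹), γ ∼ Gam(1, 1) and the N/n rescaling are assumptions where the quote is ambiguous.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data as quoted: N = 100 draws from the standard normal N(0, 1).
N = 100
x = rng.standard_normal(N)

def per_example_neg_loglik_grads(mu, gamma, batch):
    """Per-example gradients of -log N(x_i | mu, gamma^{-1}) w.r.t. (mu, gamma), shape (n, 2)."""
    r = batch - mu
    return np.stack([-gamma * r,                   # d/dmu    of 0.5*gamma*r^2 - 0.5*log(gamma)
                     0.5 * r**2 - 0.5 / gamma],    # d/dgamma of the same term
                    axis=1)

def noisy_grad_and_cov(mu, gamma, n):
    """Subsampled gradient of the negative log-posterior plus a diagonal covariance estimate.

    Prior assumed: mu | gamma ~ N(0, gamma^{-1}), gamma ~ Gam(1, 1) (conjugate Normal-Gamma);
    the likelihood part of the gradient is rescaled by N/n as is standard.
    """
    batch = rng.choice(x, size=min(n, N), replace=False)
    G = per_example_neg_loglik_grads(mu, gamma, batch)

    # Exact negative log-prior gradient for the assumed Normal-Gamma prior.
    grad_prior = np.array([gamma * mu, 0.5 * mu**2 - 0.5 / gamma + 1.0])
    grad = (N / len(batch)) * G.sum(axis=0) + grad_prior

    # Diagonal empirical covariance of the rescaled subsampled gradient; a larger subset
    # (the quoted 500-1000 on the large datasets) makes this estimate less noisy.
    cov_diag = (N**2 / len(batch)) * G.var(axis=0, ddof=1)
    return grad, cov_diag

# Example call: one subset drawn per step; in practice one would typically sample
# log(gamma) to keep gamma positive, a detail omitted from this sketch.
g, s = noisy_grad_and_cov(mu=0.0, gamma=1.0, n=10)
print(g, s)
```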