Covariance-Controlled Adaptive Langevin Thermostat for Large-Scale Bayesian Sampling
Authors: Xiaocheng Shang, Zhanxing Zhu, Benedict Leimkuhler, Amos J. Storkey
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Various numerical experiments are performed in Section 4 to verify the usefulness of CCAdL in a wide range of large-scale machine learning applications. |
| Researcher Affiliation | Academia | Xiaocheng Shang, University of Edinburgh (x.shang@ed.ac.uk); Zhanxing Zhu, University of Edinburgh (zhanxing.zhu@ed.ac.uk); Benedict Leimkuhler, University of Edinburgh (b.leimkuhler@ed.ac.uk); Amos J. Storkey, University of Edinburgh (a.storkey@ed.ac.uk) |
| Pseudocode | Yes | Algorithm 1: Covariance-Controlled Adaptive Langevin (CCAdL) |
| Open Source Code | No | The paper does not contain an explicit statement about releasing its source code or provide any links to a code repository. |
| Open Datasets | Yes | We then consider a Bayesian logistic regression model trained on the benchmark MNIST dataset for binary classification of digits 7 and 9, using 12,214 training data points and a test set of size 2,037. ... We trained a DRBM on different large-scale multi-class datasets from the LIBSVM [1] dataset collection, including connect-4, letter, and SensIT Vehicle (acoustic). Detailed information on these datasets is presented in Table 2. [1] http://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/multiclass.html (A sketch of how such a 7-vs-9 MNIST subset can be constructed is given below the table.) |
| Dataset Splits | No | The paper specifies training and test set sizes (e.g., '12, 214 training data points, with a test set of size 2037' for MNIST, and 'training/test set' columns in Table 2 for DRBM datasets), but does not explicitly mention a validation set or how data was split for validation. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory, or cloud instance types) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details, such as library names with version numbers (e.g., Python, PyTorch, TensorFlow versions), that would be necessary to replicate the experiments. |
| Experiment Setup | Yes | We apply the same experimental setting as in [5]. We generated N = 100 samples from the standard normal distribution N(0, 1). We used the likelihood function N(x_i | µ, γ⁻¹) and assigned a Normal-Gamma distribution as the prior, i.e. µ, γ ∼ N(µ | 0, γ)Gam(γ | 1, 1). ... A subset of size n = 500 was used at each timestep. ... The size of the subset was chosen as 500–1000 to obtain a reasonable variance estimate. For each dataset, we chose the first 20% of the total number of passes over the entire dataset as the burn-in period, and collected the remaining samples for prediction. (A sketch of this synthetic setup appears below the table.) |
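
To make the quoted "Experiment Setup" row concrete, the sketch below generates the synthetic Normal-Gamma data and forms a noisy minibatch gradient of the log-posterior. It is a minimal illustration only: the paper releases no code, the minibatch size and function names here are assumptions, and N(µ | 0, γ) is read as a normal with variance γ.

```python
# Minimal sketch of the synthetic experiment quoted above: N = 100 draws from
# N(0, 1), likelihood N(x_i | mu, gamma^{-1}), prior mu, gamma ~ N(mu | 0, gamma) Gam(gamma | 1, 1).
# The minibatch size and function names are illustrative assumptions, not from the paper.
import numpy as np

rng = np.random.default_rng(0)
N = 100
x = rng.standard_normal(N)             # N samples from the standard normal N(0, 1)

def stochastic_grad_neg_log_post(mu, gamma, batch_size=10):
    """Noisy gradient of the negative log-posterior from a random minibatch."""
    xb = x[rng.choice(N, size=batch_size, replace=False)]
    scale = N / batch_size              # rescale the minibatch sum to the full-data sum

    # Likelihood N(x_i | mu, 1/gamma): gamma is the precision.
    d_mu = scale * gamma * np.sum(xb - mu)
    d_gamma = scale * np.sum(0.5 / gamma - 0.5 * (xb - mu) ** 2)

    # Prior: mu | gamma ~ N(0, gamma) (gamma read as the variance), gamma ~ Gam(1, 1).
    d_mu += -mu / gamma
    d_gamma += -0.5 / gamma + 0.5 * mu ** 2 / gamma ** 2 - 1.0

    return -np.array([d_mu, d_gamma])   # gradient of the negative log posterior
```

Samplers of this family (SGNHT and the paper's CCAdL) consume noisy gradients of exactly this form; the extra noise introduced by subsampling is what the covariance-control mechanism of CCAdL is designed to account for.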
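
The "Open Datasets" row reports a 7-vs-9 binary MNIST task with 12,214 training and 2,037 test points. The sketch below shows one way to build such a subset; the scikit-learn OpenML loader and the preprocessing are assumptions, since the paper does not describe how the data were prepared.

```python
# Hypothetical construction of the 7-vs-9 binary MNIST subset; the loader and
# preprocessing are assumptions, not taken from the paper.
import numpy as np
from sklearn.datasets import fetch_openml

X, y = fetch_openml("mnist_784", version=1, return_X_y=True, as_frame=False)
X_train_full, y_train_full = X[:60000], y[:60000]   # standard MNIST train split
X_test_full, y_test_full = X[60000:], y[60000:]     # standard MNIST test split

def sevens_vs_nines(Xs, ys):
    """Keep only digits 7 and 9; label digit 9 as the positive class."""
    mask = np.isin(ys, ["7", "9"])
    return Xs[mask] / 255.0, (ys[mask] == "9").astype(int)

X_train, y_train = sevens_vs_nines(X_train_full, y_train_full)  # 12,214 points
X_test, y_test = sevens_vs_nines(X_test_full, y_test_full)      # 2,037 points
```

Restricting the standard 60k/10k MNIST split to digits 7 and 9 yields the 12,214/2,037 counts reported in the paper, which suggests the subset was formed this way, although the paper does not state it explicitly.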