Interacting Contour Stochastic Gradient Langevin Dynamics
Authors: Wei Deng, Siqi Liang, Botao Hao, Guang Lin, Faming Liang
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we compare the proposed algorithm with popular benchmark methods for posterior sampling. The numerical results show a great potential of ICSGLD for large-scale uncertainty estimation tasks. |
| Researcher Affiliation | Collaboration | Wei Deng^{1,2}, Siqi Liang^1, Botao Hao^3, Guang Lin^1, Faming Liang^1 (1 Purdue University, 2 Morgan Stanley, 3 DeepMind) |
| Pseudocode | Yes | Algorithm 1 Interacting contour stochastic gradient Langevin dynamics algorithm (ICSGLD). A hedged sketch of the update appears after this table. |
| Open Source Code | Yes | Code is available at github.com/WayneDW/Interacting-Contour-Stochastic-Gradient-Langevin-Dynamics. |
| Open Datasets | Yes | Our proposed algorithm achieves appealing mode explorations using a fixed learning rate on the MNIST dataset... based on the UCI Mushroom data set... on CIFAR100, and report the test accuracy (ACC) and test negative log-likelihood (NLL) based on 5 trials with standard error. For the out-of-distribution prediction performance, we test the well-trained models in Brier scores (Brier) on the Street View House Numbers dataset (SVHN). |
| Dataset Splits | No | The paper mentions training data and test data but does not explicitly provide details about specific training/validation/test dataset splits (e.g., percentages or exact counts for a validation set) within the main text or supplementary material sections provided. |
| Hardware Specification | No | The paper mentions distributed computing but does not provide specific hardware details such as GPU models, CPU models, or cloud instance types used for experiments. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments (e.g., 'Python 3.8' or 'PyTorch 1.9'). |
| Experiment Setup | Yes | The learning rate is fixed to 1e-6 and the temperature is set to 0.1. ...batch size of 2500... fix ζ = 3e4 and weight decay 25. ...choose 100,000 partitions and u = 10. The step size follows ω_k = min{0.01, 1/(k^0.6 + 100)} (see the schedule check after the table). ...initial learning rate is 2e-6... choose m = 200 and u = 200 for ResNet20, 32, and 56 and u = 60 for WRN-16-8. |
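Algorithm 1 is referenced but not reproduced above. Below is a minimal sketch of an ICSGLD-style update on a toy 1-D mixture energy. It is an illustrative reconstruction, not the paper's code: the helper names (`energy`, `grad_energy`, `part_index`) and all hyperparameter values are assumptions chosen for readability, and a real run would use mini-batch autograd gradients rather than finite differences. It shows the two coupled pieces of the method: a Langevin step whose gradient is rescaled by the contour multiplier, and a stochastic-approximation update of the partition-mass estimate θ shared across the interacting chains.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy target: mixture of two 1-D Gaussians (stands in for a mini-batch energy).
def energy(x):
    return -np.log(0.5 * np.exp(-0.5 * (x - 4.0) ** 2)
                   + 0.5 * np.exp(-0.5 * (x + 4.0) ** 2))

def grad_energy(x, h=1e-4):
    # Finite-difference gradient; a real run would use autograd gradients.
    return (energy(x + h) - energy(x - h)) / (2.0 * h)

# Illustrative hyperparameters (not the paper's CIFAR settings).
P = 4                  # number of interacting chains
m = 50                 # number of energy partitions
u_min, du = 0.0, 0.5   # partition grid: [u_min, u_min + du), ...
lr = 1e-2              # SGLD learning rate (epsilon), held fixed
tau = 1.0              # temperature
zeta = 0.75            # exponent in the multiplier / theta update

theta = np.full(m, 1.0 / m)   # self-adapting estimate of partition masses
x = rng.normal(size=P)        # one particle per chain

def part_index(e):
    """Index J(x) of the energy partition containing e, clipped to [0, m-1]."""
    return int(np.clip((e - u_min) // du, 0, m - 1))

for k in range(1, 5001):
    omega = min(0.01, 1.0 / (k ** 0.6 + 100.0))   # step size from the paper
    J = np.array([part_index(energy(xi)) for xi in x])

    # Langevin step with the contour gradient multiplier; the multiplier can
    # turn negative in low-mass regions, pushing the chain uphill in energy.
    for p in range(P):
        j = J[p]
        mult = 1.0 + zeta * tau * (np.log(theta[j]) - np.log(theta[max(j - 1, 0)])) / du
        x[p] += -lr * mult * grad_energy(x[p]) + np.sqrt(2.0 * lr * tau) * rng.normal()

    # Interacting stochastic-approximation update: all chains share one theta.
    step = np.zeros(m)
    for p in range(P):
        onehot = np.zeros(m)
        onehot[J[p]] = 1.0
        step += theta[J[p]] ** zeta * (onehot - theta)
    theta += omega * step / P
    # Guard against floating-point drift (ours): the update itself preserves
    # sum(theta) = 1 exactly when theta starts on the simplex.
    theta = np.clip(theta, 1e-10, None)
    theta /= theta.sum()
```

The key coupling is that all P chains share one θ: each chain's visited partition feeds the same stochastic-approximation update, which is what distinguishes ICSGLD from running P independent CSGLD samplers.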
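The quoted step size ω_k = min{0.01, 1/(k^0.6 + 100)} is the schedule for the stochastic-approximation update of θ, not the SGLD learning rate (which is held fixed). A quick numerical check of how it decays (helper name ours):

```python
def omega(k: int) -> float:
    """Stochastic-approximation step size: min{0.01, 1/(k^0.6 + 100)}."""
    return min(0.01, 1.0 / (k ** 0.6 + 100.0))

print([round(omega(k), 6) for k in (1, 100, 10_000, 1_000_000)])
# -> [0.009901, 0.008632, 0.002847, 0.000245]
```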