Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics
Authors: Haoyang Zheng, Hengrong Du, Qi Feng, Wei Deng, Guang Lin
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate the r2SGLD algorithm, we begin by applying it to dynamical system identification (Section 4.1), where the physical constraints are inherent in the dynamical system. Subsequently, its effectiveness in multi-modal distribution simulation is detailed in Section 4.2. Lastly, Section 4.3 illustrates the algorithm's performance in deep learning tasks. |
| Researcher Affiliation | Collaboration | Purdue University, West Lafayette, IN; Vanderbilt University, Nashville, TN; Florida State University, Tallahassee, FL; Machine Learning Research, Morgan Stanley, New York, NY |
| Pseudocode | Yes | Algorithm 1 The r2SGLD Algorithm. ... Algorithm 2 Reflected Replica Exchange Stochastic Gradient Langevin Dynamics with the DEO scheme. (A schematic sketch of a reflected update and swap test follows the table.) |
| Open Source Code | Yes | Code is available at github.com/haoyangzheng1996/r2SGLD |
| Open Datasets | Yes | We further extend the testing to CIFAR100 benchmarks, which utilize 20- and 56-layer residual networks (ResNet20 and ResNet56, respectively) for training and testing. ... The initial learning rate for CIFAR100 is scaled to 2e-4 when accounting for the training data size of 50,000. |
| Dataset Splits | Yes | The initial learning rate for CIFAR100 is scaled to 2e-4 when accounting for the training data size of 50,000. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory amounts used for running the experiments. It only discusses algorithms and software settings. |
| Software Dependencies | No | The paper mentions various algorithms and models (e.g., SGLD, HMC, ResNet) and refers to the PySINDy Python package, but it does not list specific software dependencies with version numbers (e.g., PyTorch 1.9, CUDA 11.1). |
| Experiment Setup | Yes | For SGLD and R-SGLD, the learning rate commences at 5e-6, decaying at a rate of 0.9999 per iteration after the first 10,000 iterations. ... The batch size is 2,048 across all methods. We repeat experiments for each algorithm ten times to record the mean and two standard deviations for metrics... (The quoted learning-rate schedule is sketched as code after the table.) |
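The Pseudocode row quotes the paper's two algorithm listings, but this report does not reproduce them. Purely for orientation, the following is a minimal sketch of what a reflected SGLD update and a two-chain swap test can look like on a box-constrained domain. The function names, the box geometry, and the naive Metropolis swap rule are illustrative assumptions, not the paper's Algorithm 1 or Algorithm 2, which handle general reflected boundaries and the deterministic even-odd (DEO) swap schedule.

```python
import numpy as np

def reflect(theta, lower, upper):
    """Fold an iterate back into the box [lower, upper] by mirror reflection at the walls."""
    # Repeat in case a large step overshoots a wall by more than the box width.
    while np.any(theta < lower) or np.any(theta > upper):
        theta = np.where(theta < lower, 2.0 * lower - theta, theta)
        theta = np.where(theta > upper, 2.0 * upper - theta, theta)
    return theta

def reflected_sgld_step(theta, stoch_grad, lr, temperature, lower, upper, rng):
    """One reflected SGLD update: stochastic-gradient drift, tempered Gaussian noise, reflection."""
    noise = rng.normal(size=theta.shape) * np.sqrt(2.0 * lr * temperature)
    proposal = theta - lr * stoch_grad(theta) + noise
    return reflect(proposal, lower, upper)

def naive_swap(energy_low, energy_high, temp_low, temp_high, rng):
    """Metropolis-style test for exchanging the states of a low- and a high-temperature chain.
    The paper's DEO schedule and its handling of stochastic energy estimates are omitted here."""
    log_accept = (1.0 / temp_low - 1.0 / temp_high) * (energy_low - energy_high)
    return np.log(rng.uniform()) < min(0.0, log_accept)
```

In this kind of scheme, the high-temperature chain explores more freely, and an accepted swap lets the low-temperature chain inherit its position; the reflection step is what keeps both chains inside the constrained domain.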
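For concreteness, the learning-rate schedule quoted in the Experiment Setup row can be written as a small helper. The function name and keyword defaults below are assumptions for illustration only, not code from the paper's repository.

```python
def learning_rate(step, lr0=5e-6, flat_steps=10_000, decay=0.9999):
    """Schedule as quoted: constant at lr0 for the first 10,000 iterations,
    then multiplied by 0.9999 on every subsequent iteration."""
    if step <= flat_steps:
        return lr0
    return lr0 * decay ** (step - flat_steps)

# Example: learning_rate(10_000) == 5e-6; learning_rate(20_000) is about 5e-6 * 0.9999**10_000, i.e. roughly 1.84e-6.
```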