Constrained Exploration via Reflected Replica Exchange Stochastic Gradient Langevin Dynamics

Authors: Haoyang Zheng, Hengrong Du, Qi Feng, Wei Deng, Guang Lin

ICML 2024

Each entry below gives the reproducibility variable, the assessed result, and the supporting LLM response.
Research Type: Experimental. "To validate the r2SGLD algorithm, we begin by applying it to dynamical system identification (Section 4.1), where the physical constraints are inherent in the dynamical system. Subsequently, its effectiveness in multi-mode distribution simulation is detailed in Section 4.2. Lastly, Section 4.3 illustrates the algorithm's performance in deep learning tasks."
Researcher Affiliation: Collaboration. "1 Purdue University, West Lafayette, IN; 2 Vanderbilt University, Nashville, TN; 3 Florida State University, Tallahassee, FL; 4 Machine Learning Research, Morgan Stanley, New York, NY."
Pseudocode: Yes. "Algorithm 1: The r2SGLD Algorithm. ... Algorithm 2: Reflected Replica Exchange Stochastic Gradient Langevin Dynamics with the DEO scheme."
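For orientation, the core update in Algorithm 1 pairs a stochastic-gradient Langevin proposal with a reflection step that keeps iterates inside the constrained domain, and Algorithm 2 adds temperature swaps between replicas. The following is a minimal sketch, not the authors' code: it assumes a box-shaped domain [lo, hi] and uses the standard replica-exchange acceptance ratio rather than the paper's deterministic even-odd (DEO) pairing; the names reflect, reflected_sgld_step, and maybe_swap are our own.

```python
import numpy as np

def reflect(x, lo, hi):
    # Fold coordinates that left the box [lo, hi] back inside by
    # mirroring at the walls (handles multiple crossings at once).
    span = hi - lo
    y = np.mod(x - lo, 2.0 * span)
    return lo + np.where(y > span, 2.0 * span - y, y)

def reflected_sgld_step(theta, stoch_grad, lr, temperature, lo, hi, rng):
    # Langevin proposal from a stochastic gradient, then reflection
    # into the constrained domain.
    noise = rng.standard_normal(theta.shape)
    proposal = theta - lr * stoch_grad(theta) + np.sqrt(2.0 * lr * temperature) * noise
    return reflect(proposal, lo, hi)

def maybe_swap(theta_lo, theta_hi, energy, tau_lo, tau_hi, rng):
    # Standard replica-exchange swap test between a low- and a
    # high-temperature chain; Algorithm 2 instead pairs chains
    # deterministically via the DEO scheme, which we do not reproduce.
    log_ratio = (1.0 / tau_lo - 1.0 / tau_hi) * (energy(theta_lo) - energy(theta_hi))
    if np.log(rng.uniform()) < log_ratio:
        return theta_hi, theta_lo
    return theta_lo, theta_hi
```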
Open Source Code: Yes. "Code is available at github.com/haoyangzheng1996/r2SGLD."
Open Datasets: Yes. "We further extend the testing to CIFAR-100 benchmarks, which utilize 20- and 56-layer residual networks (ResNet20 and ResNet56, respectively) for training and testing. ... The initial learning rate for CIFAR-100 is scaled to 2e-4 when accounting for the training data size of 50,000."
Dataset Splits: Yes. "The initial learning rate for CIFAR-100 is scaled to 2e-4 when accounting for the training data size of 50,000."
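The quoted evidence refers to the standard CIFAR-100 training split of 50,000 images. A minimal loading sketch in PyTorch, assuming torchvision's stock dataset and a conventional augmentation pipeline (the paper's exact transforms are not quoted):

```python
import torch
from torchvision import datasets, transforms

# Conventional CIFAR augmentation; the paper does not specify its pipeline.
transform = transforms.Compose([
    transforms.RandomCrop(32, padding=4),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
])

# train=True yields the 50,000-image training split referenced above.
train_set = datasets.CIFAR100(root="./data", train=True, download=True,
                              transform=transform)
train_loader = torch.utils.data.DataLoader(
    train_set, batch_size=2048, shuffle=True)  # batch size quoted under Experiment Setup
```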
Hardware Specification: No. The paper does not provide specific hardware details such as GPU or CPU models, processor types, or memory amounts used for running the experiments; it only discusses algorithms and software settings.
Software Dependencies: No. The paper mentions various algorithms and models (e.g., SGLD, HMC, ResNet) and refers to the PySINDy Python package, but does not list specific software dependencies with version numbers (e.g., PyTorch 1.9, CUDA 11.1).
Experiment Setup: Yes. "For SGLD and R-SGLD, the learning rate commences at 5e-6, decaying at a rate of 0.9999 per iteration after the first 10,000 iterations. ... The batch size is 2,048 across all methods. We repeat experiments for each algorithm ten times to record the mean and two standard deviations for metrics..."
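The quoted schedule is easy to misread, so here is a minimal sketch of one consistent reading: the rate holds at 5e-6 for the first 10,000 iterations and is then multiplied by 0.9999 at every subsequent iteration (the function name lr_schedule is ours):

```python
def lr_schedule(step, base_lr=5e-6, decay=0.9999, hold_steps=10_000):
    # Constant for the first 10,000 iterations, then geometric decay
    # by a factor of 0.9999 per iteration thereafter.
    if step < hold_steps:
        return base_lr
    return base_lr * decay ** (step - hold_steps)

# Example: 20,000 iterations past the hold point, the rate has decayed
# to roughly 5e-6 * 0.9999**20000 ≈ 6.8e-7.
print(lr_schedule(30_000))
```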