Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks

Authors: Nanyang Ye, Zhanxing Zhu, Rafal Mantiuk

Venue: NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We apply our method to training deep neural networks and demonstrate its effectiveness on the CIFAR-10 and CIFAR-100 datasets, achieving comparable test error to AdamW and SGD with momentum.
Researcher Affiliation | Academia | Department of Applied Mathematics, University of Washington
Pseudocode | Yes | Algorithm 1: Langevin Dynamics with Continuous Tempering (LDCT)
Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the described methodology.
Open Datasets | Yes | We train a ResNet-18 model from scratch on the CIFAR-10 dataset using an AdamW optimizer. CIFAR-10 is a widely used public benchmark dataset.
Dataset Splits | No | The paper states the dataset is split into "50,000 training images and 10,000 test images" but does not explicitly mention a validation split or its size.
Hardware Specification | Yes | All experiments are implemented in PyTorch and run on a single NVIDIA A100 GPU.
Software Dependencies | No | The paper states that experiments are "implemented in PyTorch" but does not specify a version number for PyTorch or any other software dependencies.
Experiment Setup | Yes | We train a ResNet-18 model from scratch on the CIFAR-10 dataset using an AdamW optimizer with a learning rate of 0.001 and a batch size of 128. Models are trained for 200 epochs with a weight decay of 0.05 and no learning rate scheduler.
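As a reading aid for the Experiment Setup and Dataset Splits rows above, the following is a minimal PyTorch sketch of the training configuration as described in the report: ResNet-18 trained from scratch on CIFAR-10 with AdamW (learning rate 0.001, weight decay 0.05), batch size 128, 200 epochs, no learning rate scheduler, and the standard 50,000/10,000 train/test split. It is an illustrative reconstruction, not the authors' released code; the data transforms and the use of torchvision's resnet18 are assumptions.

```python
# Sketch of the reported setup (assumptions noted): ResNet-18 on CIFAR-10
# with AdamW, lr=0.001, weight_decay=0.05, batch size 128, 200 epochs,
# no LR scheduler. Transforms and the torchvision model are assumed, not
# taken from the paper.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.4914, 0.4822, 0.4465), (0.247, 0.243, 0.261)),
])

# Standard CIFAR-10 split: 50,000 training images, 10,000 test images.
train_set = datasets.CIFAR10("data", train=True, download=True, transform=transform)
test_set = datasets.CIFAR10("data", train=False, download=True, transform=transform)
train_loader = DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)
test_loader = DataLoader(test_set, batch_size=128, shuffle=False, num_workers=2)

# ResNet-18 trained from scratch (no pretrained weights), 10 output classes.
model = models.resnet18(num_classes=10).to(device)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

    # Report test error after each epoch.
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images, labels = images.to(device), labels.to(device)
            preds = model(images).argmax(dim=1)
            correct += (preds == labels).sum().item()
            total += labels.size(0)
    print(f"epoch {epoch + 1}: test error {1 - correct / total:.4f}")
```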