Langevin Dynamics with Continuous Tempering for Training Deep Neural Networks
Authors: Nanyang Ye, Zhanxing Zhu, Rafal Mantiuk
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We apply our method to training deep neural networks and demonstrate its effectiveness on the CIFAR-10 and CIFAR-100 datasets, achieving comparable test error to AdamW and SGD with momentum. |
| Researcher Affiliation | Academia | Department of Applied Mathematics, University of Washington |
| Pseudocode | Yes | Algorithm 1: Langevin Dynamics with Continuous Tempering (LDCT) |
| Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the described methodology. |
| Open Datasets | Yes | We train a ResNet-18 model from scratch on the CIFAR-10 dataset using an AdamW optimizer. The CIFAR-10 dataset is a widely used public benchmark. |
| Dataset Splits | No | The paper states the dataset is split into "50,000 training images and 10,000 test images" but does not explicitly mention a validation split or its size. |
| Hardware Specification | Yes | All experiments are implemented in PyTorch and run on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper states that experiments are "implemented in PyTorch" but does not specify the version number for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | We train a ResNet-18 model from scratch on the CIFAR-10 dataset using an AdamW optimizer with a learning rate of 0.001 and a batch size of 128. Models are trained for 200 epochs with a weight decay of 0.05 and no learning rate scheduler. |
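
The Pseudocode row refers to the paper's Algorithm 1 (LDCT) but does not reproduce it. The sketch below is **not** that algorithm; it only illustrates, under assumed details, the generic idea behind it: a stochastic-gradient Langevin update whose injected noise is scaled by a temperature that varies continuously during training. The `temperature` schedule, step size, and the way temperature is driven are illustrative assumptions, not the paper's coupling of the tempering variable to the dynamics.

```python
# Illustrative sketch only: an SGLD-style parameter update with a continuously
# varying noise temperature. temperature = 0 recovers plain gradient descent.
import math
import torch

def langevin_step(params, grads, lr, temperature):
    """One Langevin-style update: gradient step plus Gaussian noise whose
    variance (2 * lr * temperature) follows the discretized Langevin dynamics."""
    with torch.no_grad():
        for p, g in zip(params, grads):
            noise = torch.randn_like(p) * math.sqrt(2.0 * lr * temperature)
            p.add_(-lr * g + noise)

def temperature(step, period=1000, t_max=1.0):
    """A smooth temperature schedule (an assumption for this sketch): oscillates
    between 0 (pure optimization) and t_max (high-noise exploration)."""
    return 0.5 * t_max * (1.0 - math.cos(2.0 * math.pi * step / period))

# Usage inside a training loop (grads obtained via autograd):
#   grads = torch.autograd.grad(loss, params)
#   langevin_step(params, grads, lr=1e-3, temperature=temperature(step))
```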
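The Experiment Setup row lists the concrete hyperparameters extracted by the LLM (ResNet-18, CIFAR-10, AdamW, learning rate 0.001, batch size 128, 200 epochs, weight decay 0.05, no scheduler). A minimal PyTorch sketch of that configuration follows; the data augmentation, normalization, and the exact ResNet-18 variant (torchvision's ImageNet-style stem rather than a CIFAR-specific stem) are assumptions, since the table does not specify them.

```python
# Minimal sketch of the reported baseline setup; augmentation/normalization are
# assumptions, as they are not specified in the table above.
import torch
import torch.nn as nn
import torchvision
from torchvision import transforms

device = "cuda" if torch.cuda.is_available() else "cpu"

transform = transforms.Compose([transforms.ToTensor()])  # assumed minimal preprocessing
train_set = torchvision.datasets.CIFAR10("./data", train=True, download=True, transform=transform)
test_set = torchvision.datasets.CIFAR10("./data", train=False, download=True, transform=transform)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=128, shuffle=True, num_workers=2)
test_loader = torch.utils.data.DataLoader(test_set, batch_size=128, shuffle=False, num_workers=2)

model = torchvision.models.resnet18(num_classes=10).to(device)  # trained from scratch
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.05)
criterion = nn.CrossEntropyLoss()

for epoch in range(200):  # 200 epochs, no learning-rate scheduler
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```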