Meta-Learning For Stochastic Gradient MCMC
Authors: Wenbo Gong, Yingzhen Li, José Miguel Hernández-Lobato
ICLR 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments validate the proposed approach on learning tasks with Bayesian fully-connected neural networks, Bayesian convolutional neural networks, and Bayesian recurrent neural networks, showing that the learned sampler outperforms generic, hand-designed SG-MCMC algorithms and generalizes to different datasets and larger architectures. |
| Researcher Affiliation | Collaboration | 1University of Cambridge 2Microsoft Research Cambridge |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available at https://github.com/WenboGong/MetaSGMCMC. |
| Open Datasets | Yes | Next, we consider Bayesian neural network classification on MNIST data... on convolutional neural networks (CNNs) for CIFAR-10 (Krizhevsky, 2009) classification... |
| Dataset Splits | Yes | We split the 50,000 training images into 45,000 training and 5,000 validation images, and tune the discretization step size of each sampling and optimization method on the validation set for 80 epochs. (A minimal sketch of this split appears after the table.) |
| Hardware Specification | No | The paper mentions 'Parallel computation with GPUs improves real-time speed' but does not provide specific hardware details such as GPU models, CPU types, or memory specifications used for the experiments. |
| Software Dependencies | No | The paper mentions 'TensorFlow' with a citation from 2015, but does not provide specific version numbers for TensorFlow or any other software libraries or dependencies used in the experiments. |
| Experiment Setup | Yes | The paper provides extensive details on the experimental setup, including hyperparameters such as step sizes, initialization methods, number of epochs, number of steps per epoch, and optimizer learning rates. For example, in Section F.1: 'The initial positions are drawn from Uniform([0, 6]D). We train our sampler for 100 epochs and each epoch consists of 4 x 100 steps. For every 100 steps, we update the Q and D matrices using the Adam optimizer with learning rate 0.0005.' |
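
The snippet below is a minimal sketch of the CIFAR-10 split quoted in the Dataset Splits row, assuming a PyTorch/torchvision setup for illustration (the paper's released code is TensorFlow-based, and no split seed is reported): the 50,000 training images are partitioned into 45,000 for training and 5,000 for validation.

```python
# Hypothetical reconstruction of the 45,000 / 5,000 CIFAR-10 split described in the
# paper; torchvision and the fixed seed are assumptions, not the authors' code.
import numpy as np
from torch.utils.data import Subset
from torchvision import datasets

full_train = datasets.CIFAR10(root="./data", train=True, download=True)  # 50,000 images

rng = np.random.default_rng(0)                 # seed chosen here; the paper does not specify one
perm = rng.permutation(len(full_train))
val_set = Subset(full_train, perm[:5_000])     # 5,000 validation images for step-size tuning
train_set = Subset(full_train, perm[5_000:])   # 45,000 training images
```

Per the quoted setup, the discretization step size of each sampling and optimization method is then tuned on the 5,000-image validation set for 80 epochs.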