BAM: Bayes with Adaptive Memory

Authors: Josue Nassar, Jennifer Rogers Brennan, Ben Evans, Kendall Lowrey

ICLR 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To demonstrate the versatility of BAM, we apply it in a variety of scenarios. As BAM is a learning paradigm, it can be implemented as a module in a larger framework, allowing it to be easily used in settings such as control/reinforcement learning and domain adaptation (Thompson, 1933; Osband et al., 2018; Lowrey et al., 2018; Yoon et al., 2018). BAM requires the ability to construct the posterior, p(θ_t | D_{<t}, W_t), and evaluate the log marginal likelihood, log p(D_t | D_{<t}, W_t). (A sketch of these two quantities appears after the table.)
Researcher Affiliation | Academia | Josue Nassar, Department of Electrical and Computer Engineering, Stony Brook University (josue.nassar@stonybrook.edu); Jennifer Brennan, Department of Computer Science, University of Washington (jrb@cs.washington.edu); Ben Evans, Department of Computer Science, New York University (benevans@nyu.edu); Kendall Lowrey, Department of Computer Science, University of Washington (klowrey@cs.washington.edu)
Pseudocode | Yes | Algorithm 1: Bottom-Up Greedy for BAM (see the sketch after the table)
Open Source Code | No | The paper does not provide any explicit statement or link regarding the public release of source code for BAM or its implementations.
Open Datasets | Yes | To achieve this, we create a rotated MNIST dataset.
Dataset Splits | No | The paper describes how the rotated MNIST dataset was created and split into a training set (32 domains, 1875 samples each) and a test set (8 domains, with 10 labeled examples given to find readout weights). However, it does not explicitly define a separate validation split for hyperparameter tuning or early stopping.
Hardware Specification | No | The paper does not specify the hardware used for running the experiments, such as particular CPU or GPU models.
Software Dependencies | No | The paper mentions various methods and packages used (e.g., Kalman filter, Model Predictive Path Integral control, scikit-learn), but does not provide specific version numbers for any software dependencies.
Experiment Setup | Yes | For our controls experiments, we used Model Predictive Path Integral control (Williams et al., 2017), a model predictive control (MPC) algorithm, with a planning horizon of 50 timesteps and 32 sample trajectories. Our sampling covariance was 0.4 for each controlled joint; in the case of Cartpole, the action space is 1. The temperature parameter we used was 0.5. (The sketch after the table shows where these hyperparameters enter an MPPI update.)
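
The Research Type row above quotes BAM's two computational requirements: constructing the posterior p(θ_t | D_{<t}, W_t) and evaluating the log marginal likelihood log p(D_t | D_{<t}, W_t). Below is a minimal sketch of what those two quantities look like for conjugate Bayesian linear regression, with past datasets' likelihood contributions tempered by memory weights in [0, 1]. The model choice, function names, and hyperparameters (SIGMA2, TAU2) are illustrative assumptions, not the paper's implementation.

    import numpy as np
    from scipy.stats import multivariate_normal

    SIGMA2 = 0.1   # assumed observation noise variance (illustrative)
    TAU2 = 1.0     # assumed prior variance on theta (illustrative)

    def posterior_params(past_datasets, memory_weights, dim):
        """Posterior p(theta_t | D_<t, W_t): past likelihoods tempered by weights in [0, 1]."""
        precision = np.eye(dim) / TAU2          # start from the Gaussian prior precision
        mean_term = np.zeros(dim)
        for (X, y), w in zip(past_datasets, memory_weights):
            precision += w * (X.T @ X) / SIGMA2  # weight w discounts this dataset's evidence
            mean_term += w * (X.T @ y) / SIGMA2
        cov = np.linalg.inv(precision)
        return cov @ mean_term, cov              # posterior mean and covariance

    def log_marginal_likelihood(X_new, y_new, mean, cov):
        """log p(D_t | D_<t, W_t): Gaussian predictive density of the new batch."""
        pred_mean = X_new @ mean
        pred_cov = X_new @ cov @ X_new.T + SIGMA2 * np.eye(len(y_new))
        return multivariate_normal.logpdf(y_new, mean=pred_mean, cov=pred_cov)

In this conjugate setting both quantities are closed form; the paper's point is only that some way of computing them is needed for BAM to be used as a module inside a larger framework.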
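The Pseudocode row refers to Algorithm 1 (Bottom-Up Greedy). A hedged reading of a bottom-up greedy weight search is sketched below: start from all-zero memory weights and keep a past dataset only if switching its weight on improves the log marginal likelihood of the current data. Here log_marginal is any callable computing log p(D_t | D_{<t}, W_t) for a candidate weight vector (for example, the conjugate sketch above); the exact acceptance rule and weight space in the paper's algorithm may differ.

    def bottom_up_greedy(num_past, log_marginal):
        """Greedily choose binary memory weights that maximise log_marginal(weights)."""
        weights = [0.0] * num_past
        best = log_marginal(weights)          # score with an empty memory
        for i in range(num_past):
            weights[i] = 1.0                  # tentatively add past dataset i
            score = log_marginal(weights)
            if score > best:
                best = score                  # keep dataset i in memory
            else:
                weights[i] = 0.0              # revert: dataset i does not help
        return weights, best

    # Hypothetical wiring with the conjugate sketch above:
    # log_marginal = lambda w: log_marginal_likelihood(
    #     X_new, y_new, *posterior_params(past_datasets, w, dim))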
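The Experiment Setup row quotes the MPPI hyperparameters: a planning horizon of 50 timesteps, 32 sample trajectories, a sampling covariance of 0.4 per controlled joint, and a temperature of 0.5. The sketch below shows where those numbers enter a single MPPI planning step; rollout_cost and the function signature are hypothetical stand-ins for the simulator and control loop used in the paper's experiments.

    import numpy as np

    HORIZON, N_SAMPLES, ACTION_DIM = 50, 32, 1   # quoted horizon and samples; Cartpole action dim
    SIGMA2, TEMPERATURE = 0.4, 0.5               # quoted sampling covariance and temperature

    def mppi_step(state, nominal_plan, rollout_cost, rng=None):
        """One MPPI update: perturb the nominal plan and reweight rollouts by cost."""
        rng = rng if rng is not None else np.random.default_rng(0)
        noise = rng.normal(0.0, np.sqrt(SIGMA2), size=(N_SAMPLES, HORIZON, ACTION_DIM))
        candidates = nominal_plan[None] + noise                       # (32, 50, 1) action sequences
        costs = np.array([rollout_cost(state, plan) for plan in candidates])
        weights = np.exp(-(costs - costs.min()) / TEMPERATURE)        # soft-min over trajectory cost
        weights /= weights.sum()
        return nominal_plan + np.einsum("n,nta->ta", weights, noise)  # cost-weighted plan update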