Learning Robust Dynamics through Variational Sparse Gating

Authors: Arnav Kumar Jain, Shivakanth Sujit, Shruti Joshi, Vincent Michalski, Danijar Hafner, Samira Ebrahimi Kahou

NeurIPS 2022

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We evaluate the two model architectures in the Bring Back Shapes (BBS) environment that features a large number of moving objects and partial observability, demonstrating clear improvements over prior models." |
| Researcher Affiliation | Collaboration | 1. Université de Montréal, 2. Mila Quebec AI Institute, 3. École de technologie supérieure, 4. University of Toronto, 5. Google Brain, 6. CIFAR. |
| Pseudocode | No | The paper provides architectural diagrams (Figure 2) and mathematical equations for the models, but it does not include a block explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | Code is available at: https://github.com/arnavkj1995/VSG. |
| Open Datasets | Yes | "We developed a new partially-observable and stochastic environment, called Bring Back Shapes (BBS)..." The proposed methods were also evaluated on existing benchmarks: DeepMind Control (DMC) (Tassa et al., 2018), DMC with Natural Background (Zhang et al., 2021; Nguyen et al., 2021b), and Atari (Bellemare et al., 2013). |
| Dataset Splits | No | The paper describes training procedures within reinforcement learning environments and evaluates performance at specific timesteps (e.g., 1M and 2.5M steps) across multiple seeds, but it does not provide explicit training/validation/test dataset splits in the conventional sense (e.g., specific percentages or sample counts for a static dataset). |
| Hardware Specification | Yes | "The model was implemented using TensorFlow Probability (Dillon et al., 2017) and trained on a single NVIDIA V100 GPU with 16GB memory." |
| Software Dependencies | No | The paper mentions using TensorFlow Probability but does not specify its version number or any other software dependencies with specific version numbers. |
| Experiment Setup | Yes | "Action is a 2-dimensional continuous vector with acceleration and direction as components. Episodes last for 3000 environment steps and an action repeat (Mnih et al., 2016) of 4 was used." Baseline agents include DreamerV2 (Hafner et al., 2021) and DrQ-v2 (Yarats et al., 2022). Hyperparameters for the proposed methods VSG and SVSG are listed in Appendix C. Training times for DreamerV2, VSG, and SVSG on the BBS environment for 2.5M environment steps are around 12, 11, and 10.5 hours, respectively. Results are reported across 5 seeds. |
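The action-repeat scheme cited in the experiment setup (Mnih et al., 2016) applies each agent action for a fixed number of consecutive environment steps and accumulates the rewards, so 3000 environment steps with an action repeat of 4 correspond to 750 agent decisions per episode. The following is a minimal illustrative sketch of that mechanism; the `ToyEnv` and `ActionRepeat` classes are hypothetical and are not taken from the VSG codebase.

```python
class ToyEnv:
    """Stand-in environment (hypothetical): reward 1 per step, episode ends after 3000 steps."""

    def __init__(self, episode_length=3000):
        self.episode_length = episode_length
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0  # dummy observation

    def step(self, action):
        self.t += 1
        done = self.t >= self.episode_length
        return 0.0, 1.0, done  # observation, reward, done


class ActionRepeat:
    """Wrapper applying each agent action for `repeat` consecutive environment steps."""

    def __init__(self, env, repeat=4):
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # Repeat the action, summing rewards, and stop early if the episode ends.
        total_reward, done, obs = 0.0, False, None
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done


env = ActionRepeat(ToyEnv(episode_length=3000), repeat=4)
obs = env.reset()
steps = 0
done = False
while not done:
    # Action is a 2-D continuous vector (acceleration, direction) in BBS; a dummy is used here.
    obs, reward, done = env.step(action=(0.0, 0.0))
    steps += 1

# 3000 environment steps / action repeat of 4 = 750 agent decisions
print(steps)  # 750
```

From the agent's perspective, the wrapper simply shortens the effective episode length by the repeat factor, which is why such wrappers are commonly stacked outside the environment rather than built into it.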