Learning Robust Dynamics through Variational Sparse Gating
Authors: Arnav Kumar Jain, Shivakanth Sujit, Shruti Joshi, Vincent Michalski, Danijar Hafner, Samira Ebrahimi Kahou
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the two model architectures in the Bring Back Shapes (BBS) environment that features a large number of moving objects and partial observability, demonstrating clear improvements over prior models. |
| Researcher Affiliation | Collaboration | Université de Montréal, Mila Quebec AI Institute, École de technologie supérieure, University of Toronto, Google Brain, CIFAR. |
| Pseudocode | No | The paper provides architectural diagrams (Figure 2) and mathematical equations for the models, but it does not include a block explicitly labeled "Pseudocode" or "Algorithm". |
| Open Source Code | Yes | Code is available at: https://github.com/arnavkj1995/VSG. |
| Open Datasets | Yes | We developed a new partially-observable and stochastic environment, called Bring Back Shapes (BBS)... Lastly, the proposed methods were also evaluated on existing benchmarks DeepMind Control (DMC) (Tassa et al., 2018), DMC with Natural Background (Zhang et al., 2021; Nguyen et al., 2021b), and Atari (Bellemare et al., 2013). |
| Dataset Splits | No | The paper describes training procedures within reinforcement learning environments and evaluates performance at specific timesteps (e.g., 1M and 2.5M steps) and across multiple seeds, but it does not provide explicit training/validation/test dataset splits in the conventional sense (e.g., specific percentages or sample counts for a static dataset). |
| Hardware Specification | Yes | The model was implemented using TensorFlow Probability (Dillon et al., 2017) and trained on a single NVIDIA V100 GPU with 16GB memory. |
| Software Dependencies | No | The paper mentions using "TensorFlow Probability" but does not specify its version number or any other software dependencies with specific version numbers. |
| Experiment Setup | Yes | The action is a 2-dimensional continuous vector with acceleration and direction as components. Episodes last for 3000 environment steps and an action repeat (Mnih et al., 2016) of 4 was used. Baseline agents include DreamerV2 (Hafner et al., 2021) and DrQ-v2 (Yarats et al., 2022). In Appendix C, we mention the hyperparameters for the proposed methods VSG and SVSG. Training times for the DreamerV2, VSG and SVSG methods on the BBS environment for 2.5M environment steps are around 12, 11 and 10.5 hours, respectively. Lastly, results are reported across 5 seeds. |
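
The action repeat in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not taken from the VSG repository: `ActionRepeat`, `CountingEnv`, and `agent_decisions_per_episode` are hypothetical names, the `(obs, reward, done)` step interface is an assumption, and the sketch further assumes that the 3000-step episode length counts raw environment steps, which the table does not state explicitly.

```python
class ActionRepeat:
    """Hypothetical wrapper: apply each agent action for `repeat` raw steps."""

    def __init__(self, env, repeat=4):
        self.env = env
        self.repeat = repeat

    def reset(self):
        return self.env.reset()

    def step(self, action):
        # Accumulate reward over the repeated steps; stop early if the episode ends.
        total_reward = 0.0
        for _ in range(self.repeat):
            obs, reward, done = self.env.step(action)
            total_reward += reward
            if done:
                break
        return obs, total_reward, done


class CountingEnv:
    """Hypothetical stand-in environment that ends after `horizon` raw steps."""

    def __init__(self, horizon=3000):
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        # The action is ignored; the observation is simply the step counter.
        self.t += 1
        return self.t, 1.0, self.t >= self.horizon


def agent_decisions_per_episode(horizon=3000, repeat=4):
    """Count how many actions the agent selects in one episode."""
    env = ActionRepeat(CountingEnv(horizon), repeat)
    env.reset()
    decisions, done = 0, False
    while not done:
        # 2-dimensional action (acceleration, direction), as in the setup row.
        _, _, done = env.step((0.0, 0.0))
        decisions += 1
    return decisions
```

Under these assumptions, an episode of 3000 raw steps with an action repeat of 4 corresponds to 750 agent decisions, and 2.5M environment steps correspond to roughly 833 such episodes.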