Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
Sparse Graphical Memory for Robust Planning
Authors: Scott Emmons, Ajay Jain, Misha Laskin, Thanard Kurutach, Pieter Abbeel, Deepak Pathak
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimentally, we show that SGM significantly outperforms current state of the art methods on long horizon, sparse-reward visual navigation tasks. Project video and code are available at https://mishalaskin.github.io/sgm/. We evaluate SGM under two high-level learning frameworks: reinforcement learning (RL), and self-supervised learning (SSL). |
| Researcher Affiliation | Academia | Scott Emmons*, Berkeley AI Research; Ajay Jain*, Berkeley AI Research; Michael Laskin*, Berkeley AI Research; Thanard Kurutach, Berkeley AI Research; Pieter Abbeel, Berkeley AI Research; Deepak Pathak, Carnegie Mellon University. *Equal contribution. Author order determined randomly. EMAIL 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada. |
| Pseudocode | Yes | Algorithm 1 Build Sparse Graph |
| Open Source Code | Yes | Project video and code are available at https://mishalaskin.github.io/sgm/. |
| Open Datasets | Yes | PointEnv [9]: continuous control of a point-mass in a maze used in SoRB. Observations and goals are positional (x, y) coordinates. ViZDoom [49]: discrete control of an agent in a visual maze environment used in SPTM. Observations and goals are images. SafetyGym [38]: continuous control of an agent in a visual maze environment. Observations and goals are images, though odometry data is available for observations but not for goals. |
| Dataset Splits | No | The paper mentions environments used for testing but does not specify explicit training, validation, and test dataset splits in terms of percentages or counts required for reproduction. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper mentions Python, PyTorch, and CUDA in Appendix C, but it does not specify version numbers for these software components, which is required for reproducibility. |
| Experiment Setup | Yes | All networks are trained with the Adam optimizer [23, 24] with a learning rate of 1e-4 and batch size of 256. The replay buffer is size 1e6 transitions. We use uniform random actions for the first 1M steps for exploration, then switch to an epsilon-greedy strategy with a linear schedule from 1.0 to 0.1 over 1M steps. |
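
The exploration schedule quoted in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the names `TRAINING_CONFIG`, `epsilon_at`, and `select_action` are hypothetical, the action space is assumed to be discrete, and the warm-up phase is modeled as epsilon = 1.0 (every action uniformly random). Only the numeric values (learning rate, batch size, replay buffer size, step counts, epsilon endpoints) come from the quoted text.

```python
import random

# Hyperparameters quoted in the Experiment Setup row; the dict itself
# and its key names are illustrative assumptions.
TRAINING_CONFIG = {
    "optimizer": "Adam",
    "learning_rate": 1e-4,
    "batch_size": 256,
    "replay_buffer_size": 1_000_000,
}

WARMUP_STEPS = 1_000_000  # uniform random actions for the first 1M steps
DECAY_STEPS = 1_000_000   # linear epsilon decay over the next 1M steps
EPS_START = 1.0
EPS_END = 0.1


def epsilon_at(step: int) -> float:
    """Return the exploration rate at a given environment step."""
    if step < WARMUP_STEPS:
        # Warm-up: act uniformly at random, i.e. epsilon = 1.0.
        return 1.0
    # Fraction of the decay window completed, clamped to [0, 1].
    progress = min((step - WARMUP_STEPS) / DECAY_STEPS, 1.0)
    return EPS_START + progress * (EPS_END - EPS_START)


def select_action(step: int, greedy_action: int, num_actions: int,
                  rng: random.Random) -> int:
    """Epsilon-greedy choice over a discrete action set (an assumption)."""
    if rng.random() < epsilon_at(step):
        return rng.randrange(num_actions)
    return greedy_action
```

Under this reading, epsilon stays at 1.0 through step 1M, falls linearly to 0.1 by step 2M, and remains at 0.1 afterwards. Whether the paper's schedule counts the decay from the end of the warm-up phase, as assumed here, is not stated in the quoted text.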