Planning with Goal-Conditioned Policies

Authors: Soroush Nasiriany, Vitchyr Pong, Steven Lin, Sergey Levine

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We compare our method with planning-based and model-free methods and find that our method significantly outperforms prior work when evaluated on image-based robot navigation and manipulation tasks that require non-greedy, multi-staged behavior.
Researcher Affiliation | Academia | University of California, Berkeley. {snasiriany,vitchyr,stevenlin598,svlevine}@berkeley.edu
Pseudocode | Yes | Algorithm 1: Latent Embeddings for Abstracted Planning (LEAP)
Open Source Code | Yes | Videos of the final policies and generated subgoals and code for our implementation of LEAP are available on the paper website (https://sites.google.com/view/goal-planning).
Open Datasets | No | The paper mentions tasks like '2D Navigation', 'Push and Reach', and 'Ant Navigation', which appear to be custom environments. It does not provide concrete access information (links, DOIs, formal citations) to publicly available datasets used for training.
Dataset Splits | No | The paper states 'We train all methods on randomly initialized goals and initial states' but does not specify explicit training, validation, or test dataset splits (e.g., percentages or counts) or refer to standard predefined splits.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU models, CPU types, memory) used for running experiments.
Software Dependencies | No | The paper does not specify any software dependencies with version numbers.
Experiment Setup | Yes | All of our tasks use Tmax = 100, and LEAP uses CEM to optimize over K = 3 subgoals, each of which are 25 time steps apart. ... This task has a significantly longer horizon of Tmax = 600, and LEAP uses CEM to optimize over K = 11 subgoals, each of which are 50 time steps apart.
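The Pseudocode and Experiment Setup rows describe the core of LEAP: CEM searches over a sequence of K latent subgoals, scored with a goal-conditioned value function, and the goal-conditioned policy then pursues each subgoal in turn for a fixed number of time steps. The snippet below is a minimal sketch of that planning step, assuming pre-encoded latent start/goal vectors and a learned pairwise value function; the function names, the toy distance-based value, and the default hyperparameters are illustrative assumptions, not the authors' released implementation.

```python
import numpy as np

def cem_plan_subgoals(value_fn, z_start, z_goal, K=3, latent_dim=16,
                      iters=10, pop_size=500, elite_frac=0.05, seed=0):
    """Cross-entropy method over a flat vector of K latent subgoals.

    value_fn(z_from, z_to) estimates how reachable z_to is from z_from
    (higher is better); a candidate plan is scored by summing it along
    the chain z_start -> z_1 -> ... -> z_K -> z_goal.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(K * latent_dim)
    std = np.ones(K * latent_dim)
    n_elite = max(1, int(pop_size * elite_frac))

    for _ in range(iters):
        # Sample candidate subgoal sequences around the current distribution.
        samples = rng.normal(mean, std, size=(pop_size, K * latent_dim))
        scores = np.empty(pop_size)
        for i, flat in enumerate(samples):
            chain = [z_start, *flat.reshape(K, latent_dim), z_goal]
            scores[i] = sum(value_fn(a, b) for a, b in zip(chain[:-1], chain[1:]))
        # Refit the sampling distribution to the top-scoring plans.
        elites = samples[np.argsort(scores)[-n_elite:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6

    return mean.reshape(K, latent_dim)

# Toy usage with the settings reported for the shorter tasks
# (K = 3 subgoals, 25 time steps apart, Tmax = 100). The distance-based
# value function is a stand-in for the learned goal-conditioned value.
latent_dim = 16
z_start, z_goal = np.zeros(latent_dim), np.ones(latent_dim)
toy_value = lambda z_a, z_b: -np.linalg.norm(z_a - z_b)  # prefers short hops
subgoals = cem_plan_subgoals(toy_value, z_start, z_goal, K=3, latent_dim=latent_dim)
print(subgoals.shape)  # (3, 16): one latent subgoal per 25-step segment
```

For the longer-horizon Ant Navigation setup quoted above (Tmax = 600), the same call would use K = 11 and the policy would be given 50 time steps per subgoal; only the arguments change, not the planning loop.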