Model-Based Visual Planning with Self-Supervised Functional Distances

Authors: Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine

ICLR 2021

Reproducibility Variable Result LLM Response
Research Type Experimental In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot. In comparisons, we find that this approach substantially outperforms both model-free and model-based prior methods. Videos and visualizations are available here: https://sites.google.com/berkeley.edu/mbold.
Researcher Affiliation Academia (1) University of California, Berkeley; (2) Stanford University; (3) Carnegie Mellon University
Pseudocode No The paper describes the Model Predictive Control (MPC) algorithm and its components (like CEM) in text, but it does not provide a formally labeled “Pseudocode” or “Algorithm” block with structured steps.
Open Source Code No The paper states: “Videos and visualizations are available here: https://sites.google.com/berkeley.edu/mbold.” and “Videos of both simulated and real-world task execution can be found at the project website: https://sites.google.com/berkeley.edu/mbold.” These links are for videos and visualizations/project website, not explicit code repositories.
Open Datasets Yes The Sawyer environments are adapted from the Meta-World benchmark (Yu et al., 2019a), and the door sliding environment is based off of the environment presented by Lynch et al. (2020). ... We additionally evaluate MBOLD in a real-world drawer manipulation task using a 7-DoF Franka arm. We train the dynamics model and distance function on a preexisting dataset of 1000 trajectories collected by a weakly supervised batch exploration algorithm in prior work (Chen et al., 2020).
Dataset Splits Yes Table 2: Hyperparameters for distance learning: Train/test/val split 0.9/0.05/0.05
Hardware Specification No This research used the Savio computational cluster resource provided by the Berkeley Research Computing program at the University of California, Berkeley.
Software Dependencies No We build off of the open source implementation of Dreamer by the original authors, written in TensorFlow2... We use the open-source implementation of soft actor-critic (SAC) in RLKit... We implement the k-NN search using the GPU-enabled FAISS library (Johnson et al., 2017).
Experiment Setup Yes Table 2: Hyperparameters for distance learning. Table 3: Hyperparameters for model-based planning. ... Additional training hyperparameters are detailed in Table 2. In Table 3, we describe the parameters for model-based planning in our experiments. ... The particular choice of model is a design decision when implementing our method. In our implementation, we use a convolutional video prediction model adapted from SAVP (Lee et al., 2018).
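
The Pseudocode row notes that the paper describes MPC with CEM only in text. As a rough illustration of that general technique, and not the authors' implementation, the sketch below runs a cross-entropy method planner inside a replanning loop. Everything in it is an assumption made for illustration: the function names (cem_plan, toy_dynamics, toy_distance, run_mpc), the hyperparameter values, the point-mass dynamics standing in for a learned video prediction model, and the Euclidean distance standing in for a learned distance function.

```python
import numpy as np

def cem_plan(state, goal, dynamics, distance, horizon=5, action_dim=2,
             n_samples=200, n_elites=20, n_iters=3, seed=0):
    """Return the first action of the final CEM mean action sequence."""
    rng = np.random.default_rng(seed)
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(n_iters):
        # Sample candidate action sequences from the current Gaussian.
        noise = rng.standard_normal((n_samples, horizon, action_dim))
        actions = np.clip(mean + std * noise, -1.0, 1.0)
        # Roll every sequence through the model, then score the final state.
        states = np.repeat(state[None], n_samples, axis=0)
        for t in range(horizon):
            states = dynamics(states, actions[:, t])
        costs = distance(states, goal)
        # Refit the Gaussian to the lowest-cost (elite) sequences.
        elites = actions[np.argsort(costs)[:n_elites]]
        mean = elites.mean(axis=0)
        std = elites.std(axis=0) + 1e-6
    return mean[0]

# Hypothetical stand-ins: a point mass and a Euclidean cost.
def toy_dynamics(states, actions):
    return states + 0.1 * actions

def toy_distance(states, goal):
    return np.linalg.norm(states - goal, axis=-1)

def run_mpc(start, goal, steps=30):
    """Replan at every step and execute only the first planned action."""
    state = np.asarray(start, dtype=float)
    goal = np.asarray(goal, dtype=float)
    for step in range(steps):
        action = cem_plan(state, goal, toy_dynamics, toy_distance, seed=step)
        state = toy_dynamics(state[None], action[None])[0]
    return state
```

Calling run_mpc([0.0, 0.0], [1.0, 1.0]) moves the toy point mass toward the goal. In a setting like the one the table describes, the two toy functions would be replaced by the learned dynamics model and distance function, with planning parameters taken from the paper's Table 3.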