Model-Based Visual Planning with Self-Supervised Functional Distances
Authors: Stephen Tian, Suraj Nair, Frederik Ebert, Sudeep Dasari, Benjamin Eysenbach, Chelsea Finn, Sergey Levine
ICLR 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments, we find that our method can successfully learn models that perform a variety of tasks at test-time, moving objects amid distractors with a simulated robotic arm and even learning to open and close a drawer using a real-world robot. In comparisons, we find that this approach substantially outperforms both model-free and model-based prior methods. Videos and visualizations are available here: https://sites.google.com/berkeley.edu/mbold. |
| Researcher Affiliation | Academia | (1) University of California, Berkeley; (2) Stanford University; (3) Carnegie Mellon University |
| Pseudocode | No | The paper describes the Model Predictive Control (MPC) algorithm and its components (like CEM) in text, but it does not provide a formally labeled “Pseudocode” or “Algorithm” block with structured steps. |
| Open Source Code | No | The paper states: “Videos and visualizations are available here: https://sites.google.com/berkeley.edu/mbold.” and “Videos of both simulated and real-world task execution can be found at the project website: https://sites.google.com/berkeley.edu/mbold.” These links are for videos and visualizations/project website, not explicit code repositories. |
| Open Datasets | Yes | The Sawyer environments are adapted from the Meta-World benchmark (Yu et al., 2019a), and the door sliding environment is based off of the environment presented by Lynch et al. (2020). ... We additionally evaluate MBOLD in a real-world drawer manipulation task using a 7-DoF Franka arm. We train the dynamics model and distance function on a preexisting dataset of 1000 trajectories collected by a weakly supervised batch exploration algorithm in prior work (Chen et al., 2020). |
| Dataset Splits | Yes | Table 2: Hyperparameters for distance learning: Train/test/val split 0.9/0.05/0.05 |
| Hardware Specification | No | This research used the Savio computational cluster resource provided by the Berkeley Research Computing program at the University of California, Berkeley. |
| Software Dependencies | No | We build off of the open source implementation of Dreamer by the original authors, written in TensorFlow2... We use the open-source implementation of soft actor-critic (SAC) in RLKit... We implement the k-NN search using the GPU-enabled FAISS library (Johnson et al., 2017). |
| Experiment Setup | Yes | Table 2: Hyperparameters for distance learning. Table 3: Hyperparameters for model-based planning. ... Additional training hyperparameters are detailed in Table 2. In Table 3, we describe the parameters for model-based planning in our experiments. ... The particular choice of model is a design decision when implementing our method. In our implementation, we use a convolutional video prediction model adapted from SAVP (Lee et al., 2018). |
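
The Dataset Splits row reports a train/test/val split of 0.9/0.05/0.05 for distance learning. As a minimal sketch of what such a split could look like, the snippet below partitions trajectory indices in those proportions. The function name `split_trajectories`, the fixed seed, and the choice to split at the trajectory level are assumptions made for illustration; the paper's actual splitting procedure is not described in the table above.

```python
import random


def split_trajectories(num_trajectories, fractions=(0.9, 0.05, 0.05), seed=0):
    """Shuffle trajectory indices and partition them into train/test/val lists.

    Hypothetical helper: the fractions follow the reported 0.9/0.05/0.05 split,
    but splitting per trajectory and the seed are assumptions.
    """
    indices = list(range(num_trajectories))
    random.Random(seed).shuffle(indices)  # deterministic shuffle for a given seed

    n_train = round(fractions[0] * num_trajectories)
    n_test = round(fractions[1] * num_trajectories)

    train = indices[:n_train]
    test = indices[n_train:n_train + n_test]
    val = indices[n_train + n_test:]  # remainder goes to validation
    return train, test, val


# Example with 1000 trajectories, the size of the real-world dataset cited above.
train, test, val = split_trajectories(1000)
print(len(train), len(test), len(val))
```

Splitting by whole trajectory rather than by frame keeps frames from the same trajectory out of both the training and held-out sets, which is one common way to avoid leakage when training on video data.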