Visual Adversarial Imitation Learning using Variational Models
Authors: Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experiments involving several vision-based locomotion and manipulation tasks, we find that V-MAIL learns successful visuomotor policies in a sample-efficient manner, has better stability compared to prior work, and also achieves higher asymptotic performance. |
| Researcher Affiliation | Collaboration | Rafael Rafailov¹, Tianhe Yu¹, Aravind Rajeswaran²,³, Chelsea Finn¹; {rafailov, tianheyu, cbfinn}@stanford.edu, aravraj@fb.com; ¹Stanford University, ²University of Washington, ³Facebook AI Research |
| Pseudocode | Yes | Algorithm 1 V-MAIL: Variational Model-Based Adversarial Imitation Learning (see the sketch after the table). |
| Open Source Code | No | All results including videos can be found online at https://sites.google.com/view/variational-mail. |
| Open Datasets | Yes | These consist of two locomotion environments from the DeepMind Control Suite [30], the classic Car Racing environment from OpenAI Gym [31], and two dexterous manipulation tasks using the D'Claw [32] and Shadow Hand platforms (loading examples appear after the table). |
| Dataset Splits | No | The agent is provided with a fixed set of expert demonstrations collected by executing an expert policy πE, which we assume is optimal under the unknown reward function. |
| Hardware Specification | Yes | All experiments were carried out on a single Titan RTX GPU using an internal cluster for about 1000 GPU hours. |
| Software Dependencies | No | For the former, we choose DAC [18] as a representative approach, which we equip with DrQ data augmentation for greater performance on vision-based tasks. |
| Experiment Setup | No | For implementation details, see the appendix. |
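
The pseudocode row above refers to Algorithm 1 in the paper. As a rough, hypothetical illustration of the structure such a model-based adversarial imitation loop takes, the sketch below uses stub components; every name here (`world_model`, `discriminator`, `policy`, `imagine`, `encode`) is an assumption for illustration, not the authors' implementation.

```python
# Hypothetical sketch of a model-based adversarial imitation loop in the
# spirit of Algorithm 1 (V-MAIL). Every component is a stub; the actual
# method trains a variational latent dynamics model from images and an
# actor-critic policy inside that model.
import random

class Stub:
    """Placeholder for a learned component (model, discriminator, policy)."""
    def train(self, *data):
        pass  # real code: gradient steps on the component's loss
    def imagine(self, policy, horizon):
        return [random.random() for _ in range(horizon)]  # fake latent states
    def encode(self, demos):
        return [random.random() for _ in demos]           # fake latent states
    def reward(self, latents):
        return [0.0 for _ in latents]  # discriminator output used as reward

world_model, discriminator, policy = Stub(), Stub(), Stub()
expert_demos = [0.0] * 10   # fixed set of expert demonstrations
replay_buffer, horizon = [], 15

for iteration in range(3):
    # 1. Collect on-policy experience and grow the replay buffer.
    replay_buffer.append(random.random())

    # 2. Fit the variational dynamics model on all experience so far.
    world_model.train(replay_buffer)

    # 3. Train the discriminator to separate expert latent trajectories
    #    from latent rollouts imagined under the current policy.
    policy_latents = world_model.imagine(policy, horizon)
    expert_latents = world_model.encode(expert_demos)
    discriminator.train(expert_latents, policy_latents)

    # 4. Update the policy inside the learned model, with the
    #    discriminator score as a learned reward signal.
    policy.train(policy_latents, discriminator.reward(policy_latents))
```

Training the policy entirely inside the learned model is what lets this family of methods stay sample-efficient: only step 1 touches the real environment.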
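The benchmark suites in the Open Datasets row are standard Python packages; a minimal sketch of loading comparable tasks is below. The specific task names (`walker`/`walk`, `CarRacing-v0`) are assumptions chosen for illustration, not necessarily the paper's exact configuration.

```python
# Minimal example of instantiating the benchmark suites named above.
from dm_control import suite
import gym

# DeepMind Control Suite locomotion task; pixels are rendered on demand.
dmc_env = suite.load(domain_name="walker", task_name="walk")
timestep = dmc_env.reset()
pixels = dmc_env.physics.render(height=64, width=64, camera_id=0)

# Classic OpenAI Gym Car Racing environment (natively image-based).
gym_env = gym.make("CarRacing-v0")
obs = gym_env.reset()
```

The D'Claw and Shadow Hand manipulation tasks require their own simulation packages and are not shown here.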