Visual Adversarial Imitation Learning using Variational Models

Authors: Rafael Rafailov, Tianhe Yu, Aravind Rajeswaran, Chelsea Finn

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Through experiments involving several vision-based locomotion and manipulation tasks, we find that V-MAIL learns successful visuomotor policies in a sample-efficient manner, has better stability compared to prior work, and also achieves higher asymptotic performance. |
| Researcher Affiliation | Collaboration | Rafael Rafailov (1), Tianhe Yu (1), Aravind Rajeswaran (2,3), Chelsea Finn (1); {rafailov, tianheyu, cbfinn}@stanford.edu, aravraj@fb.com; (1) Stanford University, (2) University of Washington, (3) Facebook AI Research |
| Pseudocode | Yes | Algorithm 1 V-MAIL: Variational Model-Based Adversarial Imitation Learning (a high-level sketch follows this table) |
| Open Source Code | No | All results including videos can be found online at https://sites.google.com/view/variational-mail. |
| Open Datasets | Yes | These consist of two locomotion environments from the DeepMind Control Suite [30], the classic Car Racing environment from OpenAI Gym [31], and two dexterous manipulation tasks using the D'Claw [32] and Shadow Hand platforms. |
| Dataset Splits | No | The agent is provided with a fixed set of expert demonstrations collected by executing an expert policy π_E, which we assume is optimal under the unknown reward function. |
| Hardware Specification | Yes | All experiments were carried out on a single Titan RTX GPU using an internal cluster for about 1000 GPU hours. |
| Software Dependencies | No | For the former, we choose DAC [18] as a representative approach, which we equip with DrQ data augmentation for greater performance on vision-based tasks. |
| Experiment Setup | No | For implementation details, see the appendix. |
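
To make the Pseudocode row above concrete, the following is a minimal, illustrative sketch of the model-based adversarial imitation loop that Algorithm 1 (V-MAIL) describes: a discriminator is trained to separate expert latent states from latent states imagined under a learned latent model, and the policy is optimized in imagination against a discriminator-derived reward. All module sizes, helper names (`dynamics`, `discriminator`, `policy`, `imagine_rollout`), and the toy random data are assumptions for illustration; this is not the authors' implementation, which (per the table) is not open source, and the variational model learning and image encoding steps are omitted.

```python
# Illustrative sketch of a model-based adversarial imitation loop in the spirit
# of Algorithm 1 (V-MAIL). All names, sizes, and data below are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT_DIM, ACTION_DIM, HORIZON, BATCH = 32, 4, 10, 64

# Hypothetical components: a latent dynamics model standing in for the learned
# variational model, a discriminator over latent states, and a latent-space policy.
dynamics = nn.GRUCell(ACTION_DIM, LATENT_DIM)
discriminator = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, 1))
policy = nn.Sequential(nn.Linear(LATENT_DIM, 64), nn.ReLU(), nn.Linear(64, ACTION_DIM))

d_opt = torch.optim.Adam(discriminator.parameters(), lr=3e-4)
pi_opt = torch.optim.Adam(policy.parameters(), lr=3e-4)


def imagine_rollout(z):
    """Roll the current policy forward inside the learned latent model."""
    latents = []
    for _ in range(HORIZON):
        action = torch.tanh(policy(z))
        z = dynamics(action, z)
        latents.append(z)
    return torch.stack(latents)  # (HORIZON, BATCH, LATENT_DIM)


for step in range(100):
    # Toy stand-ins: encoded expert states and initial latents from a replay
    # buffer. A real implementation would encode image observations instead.
    expert_z = torch.randn(BATCH, LATENT_DIM)
    start_z = torch.randn(BATCH, LATENT_DIM)

    # 1) Discriminator update: expert latents vs. on-policy imagined latents.
    with torch.no_grad():
        imagined = imagine_rollout(start_z).reshape(-1, LATENT_DIM)
    d_loss = (
        F.binary_cross_entropy_with_logits(discriminator(expert_z),
                                           torch.ones(BATCH, 1))
        + F.binary_cross_entropy_with_logits(discriminator(imagined),
                                             torch.zeros(imagined.shape[0], 1))
    )
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # 2) Policy update: maximize a discriminator-derived reward on imagined rollouts.
    imagined = imagine_rollout(start_z)
    reward = F.logsigmoid(discriminator(imagined))  # log D(z) as a surrogate reward
    pi_loss = -reward.mean()
    pi_opt.zero_grad()
    pi_loss.backward()
    pi_opt.step()
```

The sketch only conveys the alternation between discriminator and policy updates inside the learned model; details such as model learning, stochastic policies, and replay-buffer handling are deliberately left out and would be needed in any faithful reproduction.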