Cross-Domain Imitation Learning via Optimal Transport

Authors: Arnaud Fickinger, Samuel Cohen, Stuart Russell, Brandon Amos

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate the effectiveness of GWIL in non-trivial continuous control domains ranging from simple rigid transformation of the expert domain to arbitrary transformation of the state-action space. Our experiments show that GWIL learns optimal behaviors with a single demonstration from another domain without any proxy tasks in non-trivial continuous control settings."
Researcher Affiliation | Collaboration | Arnaud Fickinger (1,3), Samuel Cohen (2,3), Stuart Russell (1), Brandon Amos (3); (1) Berkeley AI Research, (2) University College London, (3) Facebook AI
Pseudocode | Yes | Algorithm 1: Gromov-Wasserstein imitation learning from a single expert demonstration. (A minimal sketch of the coupling step follows the table.)
Open Source Code | Yes | Project site with videos and code: https://arnaudfickinger.github.io/gwil/
Open Datasets | Yes | "To answer these three questions, we use simulated continuous control tasks implemented in MuJoCo (Todorov et al., 2012) and the DeepMind Control Suite (Tassa et al., 2018). We evaluate the capacity of IL methods to transfer to rigid transformation of the expert domain by using the Point Mass Maze environment from Hejna et al. (2020)." (An environment-loading sketch follows the table.)
Dataset Splits | No | The paper does not provide dataset split information (exact percentages, sample counts, citations to predefined splits, or a splitting methodology) for training, validation, and testing.
Hardware Specification | No | The paper does not provide hardware details (GPU/CPU models, processor speeds, or memory amounts) for the machines used in its experiments.
Software Dependencies | No | The paper mentions software such as MuJoCo, the DeepMind Control Suite, and the Soft Actor-Critic algorithm, but does not provide version numbers for these components.
Experiment Setup | No | The main text does not report concrete experimental setup details such as hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or detailed training configurations.
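
To make the "Pseudocode" row concrete: the paper's Algorithm 1 is built around a Gromov-Wasserstein coupling between an expert trajectory and an agent trajectory, which lets states be aligned across domains with different state-action spaces. The sketch below, using the POT library, illustrates only that coupling step on toy data; the trajectory shapes and the Euclidean intra-domain distances are assumptions, and the full method additionally derives a pseudo-reward from the coupling and optimizes it with Soft Actor-Critic.

```python
# Minimal sketch of the Gromov-Wasserstein coupling at the core of GWIL,
# using the POT library (pip install pot). This is an illustration on toy
# data, not the authors' implementation.
import numpy as np
import ot
from scipy.spatial.distance import cdist

rng = np.random.default_rng(0)
expert_states = rng.normal(size=(100, 4))  # toy expert trajectory (T_e x d_e)
agent_states = rng.normal(size=(120, 6))   # toy agent trajectory (T_a x d_a)

# GW compares intra-domain pairwise distance matrices rather than states
# across domains, so the two state spaces may have different dimensions.
C_expert = cdist(expert_states, expert_states)
C_agent = cdist(agent_states, agent_states)

# Uniform weights over trajectory steps.
p = ot.unif(len(expert_states))
q = ot.unif(len(agent_states))

# Coupling that aligns the trajectories by matching their internal geometry.
coupling, log = ot.gromov.gromov_wasserstein(
    C_expert, C_agent, p, q, loss_fun="square_loss", log=True)
print("GW distance:", log["gw_dist"])
```

The key design point the sketch captures is that the objective only ever compares distances within each domain, which is what makes the method invariant to isometries such as the rigid transformations tested in the experiments.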
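For the "Open Datasets" row, the benchmark environments are standard simulated control tasks. Below is a minimal sketch of loading and stepping a DeepMind Control Suite task with a random policy; the walker/walk domain-task pair is an illustrative assumption, and the Point Mass Maze of Hejna et al. (2020) is distributed separately rather than as part of the suite.

```python
# Minimal sketch of running a DeepMind Control Suite task with random
# actions; the specific domain/task names are illustrative assumptions.
import numpy as np
from dm_control import suite

env = suite.load(domain_name="walker", task_name="walk")
action_spec = env.action_spec()
time_step = env.reset()

while not time_step.last():
    # Sample a uniformly random action within the bounded action spec.
    action = np.random.uniform(action_spec.minimum, action_spec.maximum,
                               size=action_spec.shape)
    time_step = env.step(action)
```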