Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Cross-Domain Imitation Learning via Optimal Transport

Authors: Arnaud Fickinger, Samuel Cohen, Stuart Russell, Brandon Amos

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of GWIL in non-trivial continuous control domains ranging from simple rigid transformation of the expert domain to arbitrary transformation of the state-action space. Our experiments show that GWIL learns optimal behaviors with a single demonstration from another domain without any proxy tasks in non-trivial continuous control settings.
Researcher Affiliation | Collaboration | Arnaud Fickinger (Berkeley AI Research, Facebook AI), Samuel Cohen (University College London, Facebook AI), Stuart Russell (Berkeley AI Research), Brandon Amos (Facebook AI)
Pseudocode | Yes | Algorithm 1: Gromov-Wasserstein imitation learning from a single expert demonstration.
Open Source Code | Yes | Project site with videos and code: https://arnaudfickinger.github.io/gwil/
Open Datasets | Yes | To answer these three questions, we use simulated continuous control tasks implemented in Mujoco (Todorov et al., 2012) and the Deep Mind control suite (Tassa et al., 2018). We evaluate the capacity of IL methods to transfer to rigid transformation of the expert domain by using the Point Mass Maze environment from Hejna et al. (2020).
Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, citations to predefined splits, or detailed splitting methodology) for training, validation, and testing.
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments.
Software Dependencies | No | The paper mentions software like "Mujoco", "Deep Mind control suite", and "soft actor-critic algorithm" but does not provide specific version numbers for these components.
Experiment Setup | No | The paper does not contain specific experimental setup details such as concrete hyperparameter values (e.g., learning rates, batch sizes, number of epochs) or detailed training configurations in the main text.
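The paper's Algorithm 1 builds on the Gromov-Wasserstein coupling between the expert's and the agent's metric spaces. As an illustration only, not the authors' implementation, here is a minimal entropic Gromov-Wasserstein sketch in NumPy using iterated Sinkhorn projections in the style of Peyré, Cuturi & Solomon (2016); the function names (`sinkhorn`, `entropic_gw`), the regularization `eps`, and the toy rotated point clouds are assumptions made for this sketch.

```python
import numpy as np

def sinkhorn(M, p, q, eps, n_iter=300):
    """Entropic OT: approximate the coupling minimizing <T, M> - eps*H(T)."""
    K = np.exp(-(M - M.min()) / eps)  # shift cost for numerical stability
    u = np.ones_like(p)
    for _ in range(n_iter):
        v = q / (K.T @ u)
        u = p / (K @ v)
    return u[:, None] * K * v[None, :]

def entropic_gw(C1, C2, p, q, eps=0.5, outer=30):
    """Entropic Gromov-Wasserstein coupling with square loss.

    C1, C2 are intra-domain distance matrices; p, q are marginals.
    Alternates between linearizing the GW objective and a Sinkhorn solve.
    """
    T = np.outer(p, q)  # product coupling as initialization
    const = (C1**2 @ p)[:, None] + (q @ C2**2)[None, :]
    for _ in range(outer):
        M = const - 2.0 * C1 @ T @ C2.T  # gradient of the GW objective at T
        T = sinkhorn(M, p, q, eps)
    return T

# Toy cross-domain setup: the "expert" domain is a rigid rotation of the
# "agent" domain, so the two intra-domain distance matrices coincide and
# GW can align the spaces without any shared coordinate system.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))                       # agent-domain points
theta = np.pi / 3
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])
Y = X @ R.T                                       # expert domain: rotated copy
C1 = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
C2 = np.linalg.norm(Y[:, None] - Y[None, :], axis=-1)
p = q = np.ones(6) / 6
T = entropic_gw(C1, C2, p, q)
```

Because the rotation preserves pairwise distances, the coupling T should concentrate near the identity matching, up to entropic blur; the paper applies this idea to state-action distance matrices built from trajectories rather than raw point clouds.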