reproducibilityindex.ai

Adversarial Option-Aware Hierarchical Imitation Learning

Authors: Mingxuan Jing, Wenbing Huang, Fuchun Sun, Xiaojian Ma, Tao Kong, Chuang Gan, Lei Li

ICML 2021

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our proposed method on several robotic locomotion and manipulation tasks against state-of-the-art HIL/IL baselines. The results demonstrate that our approach attains both dramatically faster convergence and better final performance over the counterparts. A complete set of ablation studies also verify the validity of each component we proposed.
Researcher Affiliation	Collaboration	1Department of Computer Science and Technology, Tsinghua University, Beijing, China (Mingxuan Jing <jingmingxuan@outlook.com>; Wenbing Huang <hwenbing@126.com>; Fuchun Sun <fcsun@tsinghua.edu.cn>) 2THU-Bosch JCML center 3University of California, Los Angeles, USA 4Bytedance AI Lab, Beijing, China 5MIT-IBM Watson AI Lab, USA.
Pseudocode	Yes	Algorithm 1 Option-GAIL
Open Source Code	No	The paper does not include an explicit statement or link for the release of its own source code.
Open Datasets	Yes	Hopper-v2 and Walker2d-v2: The Hopper-v2 and the Walker2d-v2 are two standardized continuous-time locomotion environments implemented in the OpenAI Gym (Brockman et al., 2016) with the MuJoCo (Todorov et al., 2012) physics simulator. ... AntPush-v0: ...proposed in Nachum et al. (2018), ... Closemicrowave2: The Closemicrowave2 is a more challenging robot operation environment in RLBench (James et al., 2020).
Dataset Splits	No	The paper mentions using expert demonstrations for 'learning' but does not specify a validation set or how the demonstrations are split for training and validation purposes of the proposed model.
Hardware Specification	No	The paper does not specify any hardware details like GPU/CPU models or other computing infrastructure used for the experiments.
Software Dependencies	No	The paper references various software and environments like OpenAI Gym (Brockman et al., 2016), MuJoCo (Todorov et al., 2012), PPO (Schulman et al., 2017), DAC (Zhang & Whiteson, 2019), and RLBench (James et al., 2020), but it does not specify version numbers for these or other software dependencies.
Experiment Setup	Yes	Specifically, we allow 4 available option classes for all environments, a Multi-Layer Perceptron (MLP) with hidden size (64, 64) to implement the policies of both levels on Hopper-v2, Walker2d-v2, AntPush-v0, and (128, 128) on Closemicrowave2; the discriminator is realized by an MLP with hidden size (256, 256) on all environments.
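
The hyperparameters quoted in the Experiment Setup row can be collected into a small configuration sketch. This is only an illustration of the reported sizes, not the authors' code (the paper releases none): the names `SetupConfig`, `make_config`, and `layer_shapes` are hypothetical, and the input/output dimensions in the usage note are placeholders chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Policy hidden sizes per environment, as quoted in the Experiment Setup row.
# The dictionary itself is a hypothetical container for those values.
POLICY_HIDDEN: Dict[str, Tuple[int, int]] = {
    "Hopper-v2": (64, 64),
    "Walker2d-v2": (64, 64),
    "AntPush-v0": (64, 64),
    "Closemicrowave2": (128, 128),
}


@dataclass(frozen=True)
class SetupConfig:
    """Reported hyperparameters for one environment (hypothetical structure)."""
    env_name: str
    policy_hidden: Tuple[int, int]
    num_options: int = 4                                # 4 option classes everywhere
    discriminator_hidden: Tuple[int, int] = (256, 256)  # same on all environments


def make_config(env_name: str) -> SetupConfig:
    """Look up the reported sizes; unknown environments raise KeyError."""
    return SetupConfig(env_name=env_name, policy_hidden=POLICY_HIDDEN[env_name])


def layer_shapes(in_dim: int, hidden: Tuple[int, ...], out_dim: int) -> List[Tuple[int, int]]:
    """Return (fan_in, fan_out) for each linear layer of an MLP."""
    dims = (in_dim, *hidden, out_dim)
    return list(zip(dims[:-1], dims[1:]))
```

For example, `layer_shapes(10, make_config("Closemicrowave2").policy_hidden, 2)` lists the three linear layers of a policy MLP with a placeholder 10-dimensional input and 2-dimensional output. How the option index is fed to each network is not stated in the row above, so the sketch leaves it out.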