Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

Authors: Jongjin Park, Younggyo Seo, Chang Liu, Li Zhao, Tao Qin, Jinwoo Shin, Tie-Yan Liu

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate that OREO significantly improves the performance of behavioral cloning, outperforming various other regularization and causality-based methods on a variety of Atari environments and a self-driving CARLA environment."
Researcher Affiliation | Collaboration | Jongjin Park (1), Younggyo Seo (1), Chang Liu (2), Li Zhao (2), Tao Qin (2), Jinwoo Shin (1), Tie-Yan Liu (2); (1) Korea Advanced Institute of Science and Technology, (2) Microsoft Research Asia
Pseudocode | Yes | "see Figure 2 and Algorithm 1 for the overview and pseudocode of OREO, respectively."
Open Source Code | Yes | "Our source code is available at https://github.com/microsoft/causal-imitation-learning."
Open Datasets | Yes | "For expert demonstrations, we utilize DQN Replay dataset [1]. As this dataset consists of 50M transitions of each environment collected during the training of a DQN agent [32], we use the last N trajectories as expert demonstrations."
Dataset Splits | Yes | "To see how this works in our setup, we first introduce a validation dataset consisting of 5 expert demonstrations on confounded Pong environment... We evaluate the performance of OREO with a varying number of expert demonstrations N ∈ {5, 10, 20, 35, 50}."
Hardware Specification | Yes | "We use a single Nvidia P100 GPU and 8 CPU cores for each training run."
Software Dependencies | No | The paper mentions using the "Dopamine library [9]" but does not provide version numbers for it or for other key software components, such as Python or the deep learning framework.
Experiment Setup | Yes | "As for hyperparameter selection, we use the default hyperparameters from previous or similar works [35, 50], i.e., a drop probability of p = 0.5, a codebook size of K = 512, and a commitment cost of β = 0.25."
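To make the hyperparameters concrete: OREO learns a VQ-VAE over encoder features and then, instead of dropping individual units, drops together all spatial cells assigned to the same discrete codebook entry, using the drop probability p = 0.5 and codebook size K = 512 quoted above. The following NumPy sketch illustrates that masking step only; the function name, array shapes, and API are illustrative assumptions, not the authors' implementation (see their released source code for the real one).

```python
import numpy as np

def oreo_dropout(features, code_indices, num_codes=512, p=0.5, rng=None):
    """Object-aware dropout (illustrative sketch): drop every feature cell
    assigned to a dropped VQ-VAE codebook entry together, rather than
    dropping cells independently as in standard dropout.

    features:     (B, C, H, W) encoder feature map
    code_indices: (B, H, W) index of the nearest codebook entry per cell
    """
    rng = rng or np.random.default_rng()
    B = features.shape[0]
    # One Bernoulli keep/drop decision per codebook entry, per sample.
    keep_code = (rng.random((B, num_codes)) > p).astype(features.dtype)
    # Look up each cell's decision through its code assignment: (B, 1, H, W).
    mask = np.take_along_axis(
        keep_code, code_indices.reshape(B, -1), axis=1
    ).reshape(B, 1, *code_indices.shape[1:])
    # Inverted-dropout rescaling keeps the expected activation unchanged.
    return features * mask / (1.0 - p)
```

Because the mask is indexed by discrete code rather than by position, all cells that the VQ-VAE groups together (e.g., cells covering one object) are kept or removed as a unit, which is the mechanism the paper credits for reducing causal confusion.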