Mitigating Covariate Shift in Imitation Learning via Offline Data With Partial Coverage
Authors: Jonathan Chang, Masatoshi Uehara, Dhruv Sreenivas, Rahul Kidambi, Wen Sun
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Complementing our theory results, we also demonstrate that a practical implementation of our approach mitigates covariate shift on benchmark MuJoCo continuous control tasks. We demonstrate that with behavior policies whose performances are less than half of that of the expert, MILO still successfully imitates with an extremely low number of expert state-action pairs while traditional offline IL methods such as behavior cloning (BC) fail completely. Source code is provided at https://github.com/jdchang1/milo. |
| Researcher Affiliation | Collaboration | Jonathan D. Chang (Department of Computer Science, Cornell University, jdc396@cornell.edu); Masatoshi Uehara (Department of Computer Science, Cornell University, mu223@cornell.edu); Dhruv Sreenivas (Department of Computer Science, Cornell University, ds844@cornell.edu); Rahul Kidambi (Amazon Search & AI, rk773@cornell.edu); Wen Sun (Department of Computer Science, Cornell University, ws455@cornell.edu) |
| Pseudocode | Yes | Algorithm 1 Framework for model-based Imitation Learning with offline data (MILO) and Algorithm 2 A practical instantiation of MILO |
| Open Source Code | Yes | Source code is provided at https://github.com/jdchang1/milo. |
| Open Datasets | Yes | We evaluate MILO on five environments from OpenAI Gym [11] simulated with MuJoCo [67]: Hopper-v2, Walker2d-v2, HalfCheetah-v2, Ant-v2, and Humanoid-v2. |
| Dataset Splits | No | The paper mentions collecting 'expert dataset' and 'offline static dataset' but does not specify explicit training, validation, and test splits with percentages or sample counts for these datasets in the conventional sense of supervised learning. The evaluation is done by running policies in the simulation environments. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU/CPU models, memory, or cloud computing instances. |
| Software Dependencies | No | The paper mentions using OpenAI Gym [11] and MuJoCo [67] environments and neural networks, but it does not provide specific version numbers for these or any other software libraries, frameworks, or dependencies used in the experiments. |
| Experiment Setup | Yes | The paper's appendix provides details on hyperparameters, environment configurations, and the composition of the expert and offline datasets. |
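The paper repeatedly compares MILO against behavior cloning (BC), the standard offline imitation baseline that reduces imitation to supervised regression on expert state-action pairs. As a hedged illustration of what that baseline does (not the paper's implementation; all data below is synthetic and the linear policy class is an assumption chosen for brevity):

```python
import numpy as np

# Minimal behavior-cloning sketch: fit a policy to expert state-action
# pairs by supervised regression. A linear policy and synthetic data are
# illustrative assumptions; the paper's BC baseline uses neural networks
# on MuJoCo observations.

rng = np.random.default_rng(0)
expert_W = rng.normal(size=(3, 2))       # hypothetical expert mapping: state -> action
states = rng.normal(size=(500, 3))       # "expert" states
actions = states @ expert_W              # noiseless expert actions

# BC = least-squares regression from states to expert actions.
W_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)

def bc_policy(s):
    """Policy learned purely from expert demonstrations, with no
    environment interaction -- hence its vulnerability to covariate shift
    once the learned policy drifts off the expert's state distribution."""
    return s @ W_bc

mse = float(np.mean((bc_policy(states) - actions) ** 2))
print(f"BC training MSE: {mse:.2e}")
```

BC minimizes error only on states the expert visited; MILO's contribution is using an offline dataset (with partial coverage) to learn a dynamics model that keeps the imitator on-distribution.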