Invariant Causal Imitation Learning for Generalizable Policies

Authors: Ioana Bica, Daniel Jarrett, Mihaela van der Schaar

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experimentally, we compare our methods against several benchmarks in control and healthcare tasks and show its effectiveness in learning imitation policies capable of generalizing to unseen environments. We perform experiments on Open AI gym tasks [47] and on an ICU dataset from the MIMIC III database [48].
Researcher Affiliation | Academia | Ioana Bica (University of Oxford, Oxford, UK; The Alan Turing Institute, London, UK); Daniel Jarrett (University of Cambridge, Cambridge, UK); Mihaela van der Schaar (University of Cambridge, Cambridge, UK; University of California, Los Angeles, USA; The Alan Turing Institute, London, UK)
Pseudocode | Yes | Further details and the full algorithm for optimizing ICIL can be found in Appendix C.
Open Source Code | Yes | The code for ICIL can be found at https://github.com/vanderschaarlab/mlforhealthlabpub and at https://github.com/ioanabica/Invariant-Causal-Imitation-Learning.
Open Datasets | Yes | We perform experiments on Open AI gym tasks [47] and on an ICU dataset from the MIMIC III database [48].
Dataset Splits | No | The paper describes training on two environments and testing on a third, unseen environment, and it varies the number of trajectories. However, it does not give specific train/validation/test splits (e.g., percentages or counts) and does not explicitly mention a validation set.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper mentions software such as "Open AI gym [47]", "RL Baselines Zoo [52]", and "Stable Open AI Baselines [53]" but does not give version numbers for these or any other dependencies, which reproducibility requires.
Experiment Setup | Yes | Implementation details about all benchmarks and the hyperparameter settings used can be found in Appendix F.3.
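
The environment-level protocol noted under Dataset Splits (train on two environments, test on a third unseen one, with a varying number of trajectories) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' code: the function name split_by_environment, the parameters test_env, n_train_per_env, and seed, and the toy environment names are all assumptions made for this example.

```python
import random


def split_by_environment(trajectories_by_env, test_env, n_train_per_env=None, seed=0):
    """Hold out one environment for testing and train on all the others.

    Hypothetical helper: `trajectories_by_env` maps an environment name to a
    list of trajectories. `n_train_per_env` optionally subsamples each training
    environment to mimic varying the number of demonstration trajectories.
    """
    if test_env not in trajectories_by_env:
        raise KeyError(f"unknown test environment: {test_env!r}")

    rng = random.Random(seed)  # fixed seed so the subsample is repeatable
    train = {}
    for env, trajs in trajectories_by_env.items():
        if env == test_env:
            continue  # the held-out environment never appears in training
        trajs = list(trajs)
        if n_train_per_env is not None:
            trajs = rng.sample(trajs, min(n_train_per_env, len(trajs)))
        train[env] = trajs

    test = list(trajectories_by_env[test_env])
    return train, test


# Toy usage with placeholder trajectory labels (assumed data, not from the paper).
data = {
    "env_a": ["a0", "a1", "a2"],
    "env_b": ["b0", "b1", "b2"],
    "env_c": ["c0", "c1"],
}
train, test = split_by_environment(data, test_env="env_c", n_train_per_env=2)
```

Splitting by environment rather than by individual trajectory keeps the test environment entirely unseen during training, which is the property the generalization claim depends on. A validation set, which the paper does not mention, would have to be carved out of the training environments separately.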