Causal Imitation for Markov Decision Processes: a Partial Identification Approach
Authors: Kangrui Ruan, Junzhe Zhang, Xuan Di, Elias Bareinboim
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we validate the theoretical findings presented in Thm. 1 and illustrate the applications of the proposed CAIL algorithms (Alg. 1 and Alg. 2) on various causal imitation learning tasks. Such tasks range from synthetic causal models to real-world scenarios. |
| Researcher Affiliation | Academia | Kangrui Ruan (Columbia University, kr2910@columbia.edu); Junzhe Zhang (Syracuse University, jzhan403@syr.edu); Xuan Di (Columbia University, sharon.di@columbia.edu); Elias Bareinboim (Columbia University, eb@cs.columbia.edu) |
| Pseudocode | Yes | Algorithm 1: Causal GAIL with Confounded Reward R (CAIL-R) |
| Open Source Code | No | Upon acceptance of this manuscript, we intend to make the source code available in the camera-ready version of the paper. |
| Open Datasets | Yes | We utilize the real-world medical treatment dataset, i.e., Medical Information Mart for Intensive Care III (MIMIC-III) dataset [22]. |
| Dataset Splits | No | The paper mentions using '1000 random discrete MDPs' and evaluating results by computing 'means and standard deviations over 100 trajectories' but does not specify explicit training, validation, or test dataset splits (e.g., percentages, sample counts, or predefined split references). |
| Hardware Specification | Yes | All experiments were conducted using Intel Cascade Lake processors, with 30 vCPUs and 120 GB memory on a system running Ubuntu 18.04. |
| Software Dependencies | No | The paper mentions 'Ubuntu 18.04' as the operating system, but does not provide specific version numbers for key software components such as programming languages, libraries, or frameworks (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We provide in Appendix D more details on the experiment setup. |