Causal Imitation Learning With Unobserved Confounders
Authors: Junzhe Zhang, Daniel Kumor, Elias Bareinboim
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate our algorithms on several synthetic datasets, including highD [18], consisting of natural trajectories of human-driven vehicles, and on MNIST digits. In all experiments, we test our causal imitation method (ci): we apply Thm. 2 when there exists a π-backdoor admissible set; otherwise, Alg. 1 is used to leverage the observational distribution. As a baseline, we also include naïve behavior cloning (bc) that mimics the observed conditional distribution P(x|pa(X)), as well as the actual reward distribution generated by an expert (opt). We found that our algorithms consistently imitate distributions over the expert's reward in imitable (p-imitable) cases, and p-imitable instances commonly exist. |
| Researcher Affiliation | Academia | Junzhe Zhang (Columbia University, junzhez@cs.columbia.edu); Daniel Kumor (Purdue University, dkumor@purdue.edu); Elias Bareinboim (Columbia University, eb@cs.columbia.edu) |
| Pseudocode | Yes | Algorithm 1: IMITATE. 1: Input: G, Π, P(o). 2: while LISTIDSPACE(G ∪ {Y}, Π, Y) outputs a policy subspace Π′ do 3: while LISTMINSEP(G ∪ Π′, X̂, Y, {}, O) outputs a surrogate set S do 4: if IDENTIFY(G, Π′, S) = YES then 5: Solve for a policy π ∈ Π′ such that P(s|do(π); M) = P(s) for any POSCM M ∈ M⟨G,P⟩. 6: Return π if it exists; continue otherwise. 7: end if 8: end while 9: end while Return FAIL. |
| Open Source Code | No | The paper does not provide an explicit statement or link for open-source code for the described methodology. |
| Open Datasets | Yes | We demonstrate our algorithms on several synthetic datasets, including highD [18], consisting of natural trajectories of human-driven vehicles, and on MNIST digits. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware specifications used for running experiments. |
| Software Dependencies | No | The paper mentions training 'GANs' but does not provide specific version numbers for any software dependencies like libraries or frameworks. |
| Experiment Setup | No | The paper mentions training GANs and using neural networks, but it does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, epochs) or optimizer settings. |
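The quoted Algorithm 1 (IMITATE) is a nested search over policy subspaces and surrogate sets. The control flow can be sketched in Python as below; the helpers `list_id_space`, `list_min_sep`, `identify`, and `solve_policy` are hypothetical stubs standing in for the paper's LISTIDSPACE, LISTMINSEP, IDENTIFY subroutines and the distribution-matching step — only the loop structure is taken from the pseudocode, not an actual implementation of those graph procedures.

```python
def imitate(G, Pi, P_obs, list_id_space, list_min_sep, identify, solve_policy):
    """Sketch of the IMITATE search loop.

    Iterates over candidate policy subspaces Pi' and surrogate sets S;
    when P(s | do(pi)) is identifiable, tries to solve for a policy pi
    whose induced distribution over S matches the observed P(s).
    Returns the policy, or None (the pseudocode's FAIL) if none exists.
    """
    for Pi_sub in list_id_space(G, Pi):        # enumerate policy subspaces Pi'
        for S in list_min_sep(G, Pi_sub):      # enumerate surrogate sets S
            if identify(G, Pi_sub, S):         # is P(s | do(pi)) identifiable?
                pi = solve_policy(Pi_sub, S, P_obs)  # match P(s|do(pi)) = P(s)
                if pi is not None:
                    return pi                  # return pi if it exists
    return None                                # FAIL
```

The helpers are passed in as callables to make explicit that they are placeholders; in the paper they are concrete graphical procedures over the causal diagram G.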