Invariant Causal Prediction for Block MDPs
Authors: Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We give empirical evidence that our methods work in both linear and nonlinear settings, attaining improved generalization over single- and multi-task baselines. |
| Researcher Affiliation | Collaboration | 1 McGill University, 2 Mila, 3 Facebook AI Research, 4 University of Oxford, 5 DeepMind. |
| Pseudocode | Yes | Algorithm 1: Linear MISA; Algorithm 2: Nonlinear Model-irrelevance State Abstraction (MISA) Learning |
| Open Source Code | Yes | Code is available at https://github.com/facebookresearch/icp-block-mdp. |
| Open Datasets | Yes | We randomly initialize the background color of two train environments from Deepmind Control (Tassa et al., 2018) from range [0, 255]. |
| Dataset Splits | No | The paper describes training and test environments but does not explicitly mention a separate validation set or how data was split for validation purposes. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU models, or memory specifications. |
| Software Dependencies | No | The paper does not specify the versions of any software dependencies (e.g., libraries, frameworks) used in the experiments. |
| Experiment Setup | Yes | Implementation details found in Appendix C.1. Implementation details and more information about Soft Actor Critic can be found in Appendix C.2. Additional plots with the hyperparameter sweep done to find a good penalty weight for IRM can also be found in Appendix C.3. |