Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization
Authors: Mirco Mutti, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon, Michael Bronstein, Marcello Restelli
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 5, Numerical Validation: 'We empirically validate the theoretical findings of this work by experimenting on a synthetic example where each environment is a person, and the MDP represents how a series of actions the person can take influences their weight (W) and academic performance (A).' |
| Researcher Affiliation | Collaboration | 1Politecnico di Milano, 2Università di Bologna, 3ETH Zurich, 4Imperial College London, 5Twitter, 6University of Oxford |
| Pseudocode | Yes | Algorithm 1 Causal Transition Model Estimation, Algorithm 2 MDP Causal Structure Estimation, Algorithm 3 MDP Bayesian Network Estimation |
| Open Source Code | No | The paper states 'The appendix of this paper can be found at https://arxiv.org/abs/2202.06545.', which points to the paper's appendix, not an explicit code repository or a statement about code release for the methodology. |
| Open Datasets | No | The paper states 'We empirically validate the theoretical findings of this work by experimenting on a synthetic example...' and refers to 'Appendix B for details on how transition models of different environments are generated.' This indicates a generated dataset, but no concrete access information (link, DOI, repository) for public availability is provided. |
| Dataset Splits | No | The paper states that 'A class M of 3 environments is used to estimate the causal transition model' in its numerical validation, but does not provide specific percentages or counts for training, validation, or test splits of the data. |
| Hardware Specification | No | The paper describes its numerical validation in Section 5 but does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks) used for the experiments. |
| Experiment Setup | No | The paper states 'All experiments are repeated 10 times' and mentions sample counts K from Algorithm 1, but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, epochs) or optimizer settings. |