reproducibilityindex.ai

Provably Efficient Causal Model-Based Reinforcement Learning for Systematic Generalization

Authors: Mirco Mutti, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon, Michael Bronstein, Marcello Restelli

AAAI 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Section 5 (Numerical Validation): 'We empirically validate the theoretical findings of this work by experimenting on a synthetic example where each environment is a person, and the MDP represents how a series of actions the person can take influences their weight (W) and academic performance (A).'
Researcher Affiliation	Collaboration	1 Politecnico di Milano; 2 Università di Bologna; 3 ETH Zurich; 4 Imperial College London; 5 Twitter; 6 University of Oxford
Pseudocode	Yes	Algorithm 1: Causal Transition Model Estimation; Algorithm 2: MDP Causal Structure Estimation; Algorithm 3: MDP Bayesian Network Estimation
Open Source Code	No	The paper states 'The appendix of this paper can be found at https://arxiv.org/abs/2202.06545.', which points to the paper's appendix, not an explicit code repository or a statement about code release for the methodology.
Open Datasets	No	The paper states 'We empirically validate the theoretical findings of this work by experimenting on a synthetic example...' and refers to 'Appendix B for details on how transition models of different environments are generated.' This indicates a generated dataset, but no concrete access information (link, DOI, repository) for public availability is provided.
Dataset Splits	No	The paper mentions using 'A class M of 3 environments is used to estimate the causal transition model' for validation, but does not provide specific percentages or counts for training, validation, or test splits of the data.
Hardware Specification	No	The paper describes its numerical validation in Section 5 but does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	The paper does not mention any specific software dependencies with version numbers (e.g., programming languages, libraries, frameworks) used for the experiments.
Experiment Setup	No	The paper states 'All experiments are repeated 10 times' and mentions sample counts K from Algorithm 1, but does not provide specific experimental setup details such as hyperparameter values (e.g., learning rate, batch size, epochs) or optimizer settings.