Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
Authors: Anders Jonsson, Emilie Kaufmann, Pierre Ménard, Omar Darwiche Domingues, Edouard Leurent, Michal Valko
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We consider random discounted MDPs with infinite horizon in which the maximal number B of successor states and the sparsity of rewards are controlled... For various values of the desired accuracy ε and of the corresponding planning horizon H = log_γ(ε(1 − γ)/2) (see Section 2), we run simulations on 200 random MDPs. |
| Researcher Affiliation | Collaboration | Anders Jonsson, Universitat Pompeu Fabra, EMAIL; Emilie Kaufmann, CNRS & ULille (CRIStAL), Inria Scool, EMAIL; Pierre Ménard, Inria Lille, Scool team, EMAIL; Omar Darwiche Domingues, Inria Lille, Scool team, EMAIL; Edouard Leurent, Renault & Inria Lille, Scool team, EMAIL; Michal Valko, DeepMind Paris, EMAIL |
| Pseudocode | Yes | A generic implementation of MDP-GapE is given in Algorithm 1 in Appendix A, where we also discuss some implementation details. |
| Open Source Code | Yes | The source code of our experiments is available at https://eleurent.github.io/planning-gap-complexity/ |
| Open Datasets | No | The paper uses 'random discounted MDPs' and describes their generation process but does not provide access information (link, DOI, citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper describes running simulations on randomly generated MDPs but does not specify training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not specify any particular hardware details such as GPU/CPU models or memory specifications used for running experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., 'Python 3.8', 'PyTorch 1.9'). |
| Experiment Setup | Yes | Table 3b: MDP-GapE parameters. Discount factor γ: 0.7. Confidence level δ: 0.1. Exploration function βr(n_h^t, δ): log(1/δ) + log n_h^t. Exploration function βp(n_h^t, δ): log(1/δ) + log n_h^t. |
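
To make the quoted settings concrete, the sketch below evaluates the planning horizon from the Research Type row, read here as H = log_γ(ε(1 − γ)/2), and the exploration function log(1/δ) + log n from Table 3b. It is an illustration only, not the authors' code: the function names are hypothetical, rounding H up to an integer is an assumption, and ε = 0.1 is an arbitrary example value (the table reports only γ = 0.7 and δ = 0.1).

```python
import math


def planning_horizon(epsilon: float, gamma: float) -> int:
    """Planning horizon H = log_gamma(epsilon * (1 - gamma) / 2).

    Rounding up to the next integer is an assumption made for this sketch.
    """
    # Change of base: log_gamma(x) = ln(x) / ln(gamma).
    return math.ceil(math.log(epsilon * (1 - gamma) / 2) / math.log(gamma))


def exploration_bonus(n: int, delta: float) -> float:
    """Exploration function log(1/delta) + log(n), for a visit count n >= 1."""
    return math.log(1 / delta) + math.log(n)


if __name__ == "__main__":
    gamma, delta = 0.7, 0.1   # values reported in Table 3b
    epsilon = 0.1             # illustrative accuracy, not taken from the paper
    print("H =", planning_horizon(epsilon, gamma))
    print("beta(n=10) =", exploration_bonus(10, delta))
```

A smaller target accuracy ε yields a longer horizon, and the exploration function grows logarithmically with the visit count.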