Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Achieving Tractable Minimax Optimal Regret in Average Reward MDPs
Authors: Victor Boone, Zihan Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | 5 Experimental illustrations: To get a grasp of how PMEVI-DT behaves in practice, we provide in Fig. 2 a first round of illustrative experiments. |
| Researcher Affiliation | Academia | Victor Boone EMAIL Univ. Grenoble Alpes, Inria, CNRS, Grenoble INP, LIG, 38000 Grenoble, France Zihan Zhang EMAIL Princeton University |
| Pseudocode | Yes | Algorithm 1: PMEVI-DT(H, T, t ↦ M_t); Algorithm 2: PMEVI(M, β, Γ, ϵ) |
| Open Source Code | Yes | The code is provided in the supplementary material, together with the scripts to reproduce the exact figures of the paper. |
| Open Datasets | Yes | In both, the environment is a river-swim which is a model known to be hard to learn despite its size, with high diameter and bias span, see Appendix D for the model's description. |
| Dataset Splits | No | The paper describes an RL environment ('river-swim') where data is generated through interaction, rather than using static datasets with predefined train/validation/test splits. Therefore, the concept of dataset splits as requested by the question does not directly apply. |
| Hardware Specification | No | The paper states that experiments 'took less than a hour on a low end laptop' but does not provide specific hardware details such as CPU/GPU models, memory, or processor types. |
| Software Dependencies | No | The paper mentions that the code is 'mostly written in Python' but does not specify a Python version or any specific library names with their version numbers. |
| Experiment Setup | No | The paper describes the environment and some high-level experimental conditions (e.g., river-swim size, use of prior knowledge), but does not provide specific hyperparameters like learning rates, batch sizes, or optimizer settings for the experiments. |