Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].
Efficient PAC Reinforcement Learning in Regular Decision Processes
Authors: Alessandro Ronca, Giuseppe De Giacomo
IJCAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Our main contribution is to show that a near-optimal policy can be PAC-learned in polynomial time in a set of parameters that describe the underlying decision process. We present an algorithm that computes a near-optimal policy with high confidence, in a number of steps that is polynomial in the required accuracy and confidence, and in a set of parameters that describe the underlying RDP. |
| Researcher Affiliation | Academia | Alessandro Ronca and Giuseppe De Giacomo DIAG, Sapienza University of Rome, Italy EMAIL |
| Pseudocode | Yes | Algorithm 1 Reinforcement Learning RL(A, γ, ϵ, δ) and Algorithm 2 Reinforcement Learning RL(A, γ, ϵ, δ, n̂) |
| Open Source Code | No | The paper does not include any statement or link about the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with a specific dataset, therefore no access information for a training dataset is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments that would require specific training/validation/test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments. |
| Software Dependencies | No | The paper mentions algorithms (e.g., the AdaCT algorithm, Value Iteration) but does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithm design and PAC analysis; it does not describe specific experimental setup details like hyperparameters or training configurations. |