Efficient PAC Reinforcement Learning in Regular Decision Processes

Authors: Alessandro Ronca, Giuseppe De Giacomo

IJCAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our main contribution is to show that a near-optimal policy can be PAC-learned in polynomial time in a set of parameters that describe the underlying decision process. We present an algorithm that computes a near-optimal policy with high confidence, in a number of steps that is polynomial in the required accuracy and confidence, and in a set of parameters that describe the underlying RDP.
Researcher Affiliation | Academia | Alessandro Ronca and Giuseppe De Giacomo, DIAG, Sapienza University of Rome, Italy; {ronca,degiacomo}@diag.uniroma1.it
Pseudocode | Yes | Algorithm 1 Reinforcement Learning RL(A, γ, ϵ, δ) and Algorithm 2 Reinforcement Learning RL(A, γ, ϵ, δ, n̂)
Open Source Code | No | The paper does not include any statement or link about the availability of open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not conduct experiments with a specific dataset; therefore, no access information for a training dataset is provided.
Dataset Splits | No | The paper is theoretical and does not conduct experiments that would require specific training/validation/test dataset splits.
Hardware Specification | No | The paper is theoretical and does not mention any specific hardware used for running experiments.
Software Dependencies | No | The paper mentions algorithms (e.g., the AdaCT algorithm, Value Iteration) but does not provide specific software dependencies with version numbers. (A generic sketch of the value-iteration subroutine follows this table.)
Experiment Setup | No | The paper is theoretical and focuses on algorithm design and PAC analysis; it does not describe specific experimental setup details like hyperparameters or training configurations.
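
For readers who want a concrete reference point for the Value Iteration subroutine cited above, the sketch below is a minimal, generic implementation of discounted value iteration on a finite MDP. It is not the authors' code; in the paper's setting such an MDP would be induced by the learned RDP, and every name here (value_iteration, P, R, tol) is a hypothetical choice for illustration.

```python
# Generic discounted value iteration on a finite MDP (illustrative sketch,
# not the paper's implementation). P[s][a] is a list of (probability,
# next_state) pairs; R[s][a] is the expected immediate reward.

def value_iteration(n_states, n_actions, P, R, gamma=0.9, tol=1e-6):
    V = [0.0] * n_states
    while True:
        delta = 0.0
        for s in range(n_states):
            best = max(
                R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                for a in range(n_actions)
            )
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        # Standard stopping rule: the greedy policy is then tol-optimal.
        if delta < tol * (1 - gamma) / (2 * gamma):
            break
    # Extract a greedy policy from the (near-)optimal value function.
    policy = [
        max(range(n_actions),
            key=lambda a, s=s: R[s][a]
            + gamma * sum(p * V[s2] for p, s2 in P[s][a]))
        for s in range(n_states)
    ]
    return V, policy

# Toy usage on a 2-state, 2-action MDP (purely illustrative numbers).
P = [[[(1.0, 0)], [(0.3, 0), (0.7, 1)]],
     [[(1.0, 1)], [(1.0, 0)]]]
R = [[0.0, 0.0], [1.0, 0.0]]
V, pi = value_iteration(2, 2, P, R, gamma=0.95)
```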