Guarantees for Epsilon-Greedy Reinforcement Learning with Function Approximation

Authors: Chris Dann, Yishay Mansour, Mehryar Mohri, Ayush Sekhari, Karthik Sridharan

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | This paper presents a theoretical analysis of such [myopic exploration] policies and provides the first regret and sample-complexity bounds for reinforcement learning with myopic exploration.
Researcher Affiliation | Collaboration | (1) Google Research, (2) Tel Aviv University, (3) Courant Institute of Mathematical Sciences, (4) Cornell University.
Pseudocode | Yes | Algorithm 1: RL with myopic exploration
Open Source Code | No | The paper does not provide any statement or link indicating that its source code is open or available.
Open Datasets | No | The paper is theoretical and does not use or refer to any specific publicly available datasets.
Dataset Splits | No | The paper is theoretical and does not describe any training, validation, or test dataset splits.
Hardware Specification | No | The paper does not mention any specific hardware used for experiments, consistent with a theoretical work.
Software Dependencies | No | The paper does not list any specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not describe a concrete experimental setup with hyperparameters or system-level training settings.
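
The Pseudocode entry above points to Algorithm 1 (RL with myopic exploration), which is not reproduced on this page. As a rough illustration of the simplest myopic exploration rule, the epsilon-greedy rule named in the title, here is a minimal sketch. It is not the paper's Algorithm 1: the function name `epsilon_greedy_action`, the plain list of estimated action values, and the lowest-index tie-breaking are all assumptions made for illustration.

```python
import random


def epsilon_greedy_action(q_values, epsilon, rng):
    """Return an action index given estimated action values (hypothetical helper).

    With probability `epsilon` a uniformly random action is chosen (exploration);
    otherwise the action with the highest estimated value is chosen (exploitation).
    """
    if rng.random() < epsilon:
        # Explore: ignore the value estimates and sample uniformly.
        return rng.randrange(len(q_values))
    # Exploit: take the greedy action; ties go to the lowest index (an assumption).
    return q_values.index(max(q_values))


if __name__ == "__main__":
    rng = random.Random(0)
    estimates = [0.1, 0.9, 0.3]  # made-up value estimates for three actions
    picks = [epsilon_greedy_action(estimates, 0.1, rng) for _ in range(1000)]
    # Most picks should be the greedy action (index 1); a few are random.
    print({a: picks.count(a) for a in range(len(estimates))})
```

With `epsilon = 0` the rule is purely greedy, and with `epsilon = 1` it is uniformly random; the value estimates themselves would come from whatever function approximator is being trained.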