The Role of Coverage in Online Reinforcement Learning

Authors: Tengyang Xie, Dylan J. Foster, Yu Bai, Nan Jiang, Sham M. Kakade

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | While our results primarily concern analysis of existing algorithms rather than algorithm design, they highlight a number of exciting directions for future research, and we are optimistic that the notion of coverability can guide the design of practical algorithms going forward.
Researcher Affiliation | Collaboration | Tengyang Xie (UIUC, tx10@illinois.edu); Dylan J. Foster (Microsoft Research, dylanfoster@microsoft.com); Yu Bai (Salesforce Research, yu.bai@salesforce.com); Nan Jiang (UIUC, nanjiang@illinois.edu); Sham M. Kakade (Harvard University, sham@seas.harvard.edu)
Pseudocode | Yes | Algorithm 1: GOLF (Jin et al., 2021a). Input: function class F, confidence width β > 0. [...] Algorithm 2: Reward-Free Exploration with GOLF [...] Algorithm 3: Offline GOLF with Exploration Data and Target Reward
Open Source Code | No | The paper neither states that its code is open source nor links to a code repository.
Open Datasets | No | As a theoretical paper focused on analysis and proofs, it does not conduct experiments on datasets, so no training data is mentioned.
Dataset Splits | No | As a theoretical paper focused on analysis and proofs, it does not describe experimental validation on data, so no dataset splits are provided.
Hardware Specification | No | As a theoretical paper, it does not describe an experimental setup or the hardware used for computation.
Software Dependencies | No | As a theoretical paper, it does not describe an experimental setup or list software dependencies with version numbers.
Experiment Setup | No | As a theoretical paper, it does not describe an experimental setup with hyperparameters or training settings.