State-free Reinforcement Learning
Authors: Mingyu Chen, Aldo Pacchiano, Xuezhou Zhang
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this work, we study the state-free RL problem, where the algorithm does not have the state information before interacting with the environment. Specifically, denoting the reachable state set by $S^{\Pi} := \{s \mid \max_{\pi \in \Pi} q^{P,\pi}(s) > 0\}$, we design an algorithm which requires no information on the state space $S$ while having a regret that is completely independent of $S$ and depends only on $S^{\Pi}$. We view this as a concrete first step towards parameter-free RL, with the goal of designing RL algorithms that require no hyper-parameter tuning. |
| Researcher Affiliation | Academia | Mingyu Chen (Boston University, mingyuc@bu.edu); Aldo Pacchiano (Boston University and Broad Institute of MIT and Harvard, pacchian@bu.edu); Xuezhou Zhang (Boston University, xuezhouz@bu.edu) |
| Pseudocode | Yes | Algorithm 1 Black-box Reduction for State-free RL (SF-RL) |
| Open Source Code | No | The paper does not contain any explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | No | This is a theoretical paper focused on algorithm design and regret analysis for MDPs. It does not use real-world datasets for training or experimentation, and therefore no public dataset information is provided. |
| Dataset Splits | No | This is a theoretical paper focused on algorithm design and regret analysis for MDPs. It does not conduct empirical experiments with data, and thus no validation splits are mentioned. |
| Hardware Specification | No | This is a theoretical paper and does not involve empirical experiments. Therefore, no hardware specifications are mentioned for running experiments. |
| Software Dependencies | No | This is a theoretical paper and does not involve empirical experiments. Therefore, no specific software dependencies with version numbers are mentioned for replicating experimental setups. |
| Experiment Setup | No | This is a theoretical paper and does not involve empirical experiments. Therefore, no specific experimental setup details, such as hyperparameters or training settings, are provided. |
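
The reachable state set quoted in the Research Type row can be illustrated on a toy tabular MDP. The sketch below is not the paper's SF-RL algorithm, which by design has no access to the transition kernel. It only unpacks the definition under several assumptions of ours: the policy class is taken to be all policies, transitions are time-homogeneous, the initial state counts as step 1 of the horizon, and the names `reachable_states` and `transitions` are hypothetical. Under these assumptions, a state has positive occupancy under some policy exactly when a positive-probability path leads to it within the horizon.

```python
def reachable_states(transitions, initial_states, horizon):
    """Return the states that some policy visits with positive probability.

    transitions: dict mapping (state, action) -> {next_state: probability}
    initial_states: iterable of states with positive initial probability
    horizon: number of steps per episode (the initial state is step 1)

    Assumption: the policy class contains every policy, so a state is
    reachable iff a positive-probability path leads to it within the horizon.
    """
    # Collect, for each state, the successors reachable under any action.
    successors = {}
    for (state, _action), next_dist in transitions.items():
        for next_state, prob in next_dist.items():
            if prob > 0:
                successors.setdefault(state, set()).add(next_state)

    reached = set(initial_states)
    frontier = set(initial_states)
    # The initial state occupies step 1, so at most horizon - 1 transitions occur.
    for _ in range(horizon - 1):
        frontier = {n for s in frontier for n in successors.get(s, ())}
        reached |= frontier
    return reached


# Toy MDP: "s3" is listed only with probability 0, and "s4" has no incoming
# transition, so neither belongs to the reachable set.
transitions = {
    ("s0", "a"): {"s1": 1.0},
    ("s0", "b"): {"s2": 0.5, "s0": 0.5},
    ("s1", "a"): {"s1": 1.0},
    ("s2", "a"): {"s3": 0.0, "s2": 1.0},
    ("s4", "a"): {"s0": 1.0},
}

reachable = reachable_states(transitions, {"s0"}, horizon=3)
```

In this toy example the nominal state space has five states while only three are reachable, which is the gap between $S$ and $S^{\Pi}$ that the quoted regret guarantee refers to.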