Efficient Model-Free Exploration in Low-Rank MDPs
Authors: Zakaria Mhammedi, Adam Block, Dylan J. Foster, Alexander Rakhlin
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | "While a detailed experimental evaluation is outside of the scope of this paper, we are optimistic about the empirical performance of the algorithm in light of the encouraging results based on the same objective in Zhang et al. [49]." |
| Researcher Affiliation | Collaboration | Zakaria Mhammedi (MIT, mhammedi@mit.edu); Adam Block (MIT, ablock@mit.edu); Dylan J. Foster (Microsoft Research, dylanfoster@microsoft.com); Alexander Rakhlin (MIT, rakhlin@mit.edu) |
| Pseudocode | Yes | Algorithm 1 SpanRL: Volumetric Exploration and Representation Learning via Barycentric Spanner; Algorithm 2 RobustSpanner: Barycentric Spanner via Approximate Linear Optimization; Algorithm 3 PSDP: Policy Search by Dynamic Programming; Algorithm 4 EstVec: Estimate E^π[F(x_h, a_h)] for policy π and function F; Algorithm 5 RepLearn: Representation Learning for Low-Rank MDPs; Algorithm 6 RepLearn: Representation Learning for Low-Rank MDPs; Algorithm 7 EstVec: Estimate E^π[F(x_h, a_h)] for given policy π and function F. A minimal Monte Carlo sketch of the EstVec step appears below the table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for their methodology. |
| Open Datasets | No | The paper is theoretical and focuses on algorithm design and theoretical guarantees for Low-Rank MDPs. It does not mention using specific datasets for empirical training or provide access information for any dataset. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments with specific datasets; therefore, it does not mention training, validation, or test splits. |
| Hardware Specification | No | The paper is theoretical and focuses on algorithm design and analysis. It does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithm design and analysis. It mentions computational methods like "standard gradient-based optimization techniques" but does not specify any software names with version numbers. |
| Experiment Setup | No | The paper specifies theoretical parameters like ε, c, and n for its algorithms and their complexity analysis, but these are not concrete experimental setup details such as learning rates, batch sizes, or optimizer settings for empirical evaluation. |
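
For reference, the EstVec subroutine listed in the Pseudocode row estimates the vector E^π[F(x_h, a_h)] by executing the policy π and averaging F at layer h. Below is a minimal Monte Carlo sketch of that idea, assuming a hypothetical `ToyMDP` environment and an `est_vec` interface of our own devising; it illustrates the estimator the pseudocode names, not the paper's actual implementation.

```python
import numpy as np

# Hypothetical toy MDP, included only so the sketch runs end to end.
class ToyMDP:
    def __init__(self, n_states=5, n_actions=3, seed=0):
        self.rng = np.random.default_rng(seed)
        # One transition kernel shared across layers, for simplicity:
        # P[x, a] is a distribution over next states.
        self.P = self.rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
        self.n_states, self.n_actions = n_states, n_actions

    def reset(self):
        return 0  # fixed initial state x_0

    def step(self, x, a):
        return int(self.rng.choice(self.n_states, p=self.P[x, a]))


def est_vec(env, policy, F, h, n_episodes):
    """Monte Carlo estimate of E^pi[F(x_h, a_h)]: roll out `policy` to
    layer h in each episode and average the vector F(x_h, a_h)."""
    total = 0.0
    for _ in range(n_episodes):
        x = env.reset()
        for step in range(h):          # advance from x_0 to x_h under pi
            x = env.step(x, policy(x, step))
        a = policy(x, h)
        total = total + np.asarray(F(x, a), dtype=float)
    return total / n_episodes


if __name__ == "__main__":
    env = ToyMDP()
    act_rng = np.random.default_rng(1)
    uniform = lambda x, step: int(act_rng.integers(env.n_actions))
    # F is a one-hot state indicator, so est_vec returns the empirical
    # distribution over states at layer h under the uniform policy.
    F = lambda x, a: np.eye(env.n_states)[x]
    print(est_vec(env, uniform, F, h=3, n_episodes=2000))
```

In the paper, expectations of this form are taken over learned feature maps; here F is just a one-hot state indicator, so the output is the empirical state distribution at layer h under a uniform policy.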