FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs
Authors: Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | Algorithmically, we develop FLAMBE, which engages in exploration and representation learning for provably efficient RL in low rank transition models. On a technical level, our analysis eliminates reachability assumptions that appear in prior results on the simpler block MDP model and may be of independent interest. |
| Researcher Affiliation | Collaboration | Alekh Agarwal (Microsoft Research, Redmond; alekha@microsoft.com); Sham Kakade (Microsoft Research, NYC; sham@cs.washington.edu); Akshay Krishnamurthy (Microsoft Research, NYC; akshaykr@microsoft.com); Wen Sun (Cornell University; ws455@cornell.edu) |
| Pseudocode | Yes | Algorithm 1 FLAMBE: Feature Learning And Model-Based Exploration; Algorithm 2 Elliptical planner |
| Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide links to a code repository. |
| Open Datasets | No | The paper focuses on theoretical contributions, algorithm design, and proofs. It does not conduct empirical experiments using datasets and therefore does not provide access information for any publicly available or open datasets. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets. Therefore, it does not provide information on training, validation, or test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments that would require specific hardware for execution. Therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and focuses on algorithm design and analysis. It does not describe an implementation of the algorithms or experiments that would necessitate listing specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and primarily focuses on the development and analysis of algorithms for low-rank MDPs. It does not include an empirical experimental setup with details such as hyperparameters or training configurations. |