FLAMBE: Structural Complexity and Representation Learning of Low Rank MDPs

Authors: Alekh Agarwal, Sham Kakade, Akshay Krishnamurthy, Wen Sun

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Algorithmically, we develop FLAMBE, which engages in exploration and representation learning for provably efficient RL in low rank transition models. On a technical level, our analysis eliminates reachability assumptions that appear in prior results on the simpler block MDP model and may be of independent interest.
Researcher Affiliation | Collaboration | Alekh Agarwal, Microsoft Research, Redmond, alekha@microsoft.com; Sham Kakade, Microsoft Research, NYC, sham@cs.washington.edu; Akshay Krishnamurthy, Microsoft Research, NYC, akshaykr@microsoft.com; Wen Sun, Cornell University, ws455@cornell.edu
Pseudocode | Yes | Algorithm 1 FLAMBE: Feature Learning And Model-Based Exploration; Algorithm 2 Elliptical planner
Open Source Code | No | The paper does not contain any explicit statements about releasing source code for the described methodology, nor does it provide links to a code repository.
Open Datasets | No | The paper focuses on theoretical contributions, algorithm design, and proofs. It does not conduct empirical experiments using datasets, and therefore, does not provide concrete access information for publicly available or open datasets for training.
Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with datasets. Therefore, it does not provide information on training, validation, or test dataset splits.
Hardware Specification | No | The paper is theoretical and does not describe any empirical experiments that would require specific hardware for execution. Therefore, no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and focuses on algorithm design and analysis. It does not describe an implementation of the algorithms or experiments that would necessitate listing specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and primarily focuses on the development and analysis of algorithms for low-rank MDPs. It does not include an empirical experimental setup with details such as hyperparameters or training configurations.