Prospective Side Information for Latent MDPs
Authors: Jeongyeol Kwon, Yonathan Efroni, Shie Mannor, Constantine Caramanis
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We then establish that any sample efficient algorithm must suffer at least Ω(K^{2/3})-regret, as opposed to standard Ω(√K) lower bounds. We design an algorithm with a matching upper bound that depends only polynomially on the problem parameters. In this section, we present our algorithmic results as well as lower bound analysis. |
| Researcher Affiliation | Collaboration | (1) Wisconsin Institute for Discovery, Wisconsin, USA; (2) Meta AI, New York, USA; (3) Electrical Engineering, Technion, Haifa, Israel; (4) NVIDIA; (5) Electrical and Computer Engineering, University of Texas at Austin, Texas, USA. |
| Pseudocode | Yes | Algorithm 1: Regret Minimization within Π_blind; Algorithm 2: Pure Exploration for LMDP-Ψ |
| Open Source Code | No | The paper does not contain any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not describe experiments or use any specific dataset. |
| Dataset Splits | No | The paper is theoretical and does not describe experiments or specify dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper is theoretical and does not describe experiments or specify any hardware used. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not describe experiments or provide details on experimental setup, hyperparameters, or training configurations. |