reproducibilityindex.ai

Reinforcement Learning Under Latent Dynamics: Toward Statistical and Algorithmic Modularity

Authors: Philip Amortila, Dylan J. Foster, Nan Jiang, Akshay Krishnamurthy, Zak Mhammedi

NeurIPS 2024

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	This paper addresses the question of reinforcement learning under general latent dynamics from a statistical and algorithmic perspective. On the statistical side, our main negative result shows that most well-studied settings for reinforcement learning with function approximation become intractable when composed with rich observations; we complement this with a positive result, identifying latent pushforward coverability as a general condition that enables statistical tractability. Algorithmically, we develop provably efficient observable-to-latent reductions
Researcher Affiliation	Collaboration	Philip Amortila (philipa4@illinois.edu); Dylan J. Foster (dylanfoster@microsoft.com); Nan Jiang (nanjiang@illinois.edu); Akshay Krishnamurthy (akshaykr@microsoft.com); Zakaria Mhammedi (mhammedi@google.com)
Pseudocode	Yes	Algorithm 1: O2L (Observable-to-Latent Reduction); Algorithm 2: GOLF [JLM21]; Algorithm 3: Derandomized Exponential Weights (EXPWEIGHTS.DR); Algorithm 4: Optimistic Self-Predictive Latent Model Estimation (SELFPREDICT.OPT)
Open Source Code	No	The paper is a theoretical work focusing on statistical and algorithmic modularity. It does not contain any statements about releasing code, nor does it provide links to code repositories.
Open Datasets	No	The paper is theoretical and does not conduct experiments that would require a dataset. It discusses abstract 'MDP classes' rather than specific datasets.
Dataset Splits	No	The paper is theoretical and does not involve experimental data splits for training, validation, or testing.
Hardware Specification	No	This is a theoretical paper and does not describe any specific hardware used for experiments.
Software Dependencies	No	This is a theoretical paper and does not list any software dependencies with specific version numbers relevant to experimental replication.
Experiment Setup	No	The paper is theoretical and does not describe an experimental setup including specific hyperparameter values or training configurations.