Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency

Authors: Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang

ICLR 2023

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | For a class of POMDPs with a low-rank structure in the transition kernel, RTC attains an O(1/ϵ^2) sample complexity that scales polynomially with the horizon and the intrinsic dimension (that is, the rank). Here ϵ is the optimality gap. To our best knowledge, RTC is the first sample-efficient algorithm that bridges representation learning and policy optimization in POMDPs with infinite observation and state spaces. |
| Researcher Affiliation | Academia | Lingxiao Wang, Qi Cai, Department of Industrial Engineering and Management Sciences, Northwestern University, {lingxiaowang2022,qicai2022}@northwestern.edu; Zhuoran Yang, Department of Statistics and Data Science, Yale University, zhuoran.yang@yale.edu; Zhaoran Wang, Department of Industrial Engineering and Management Sciences, Northwestern University, zhaoranwang@gmail.com |
| Pseudocode | Yes | Algorithm 1 Represent to Control |
| Open Source Code | No | The paper does not provide any concrete access information for open-source code. |
| Open Datasets | No | The paper does not provide concrete access information for a publicly available or open dataset. This paper is theoretical and does not include empirical experiments. |
| Dataset Splits | No | The paper does not provide specific dataset split information. This paper is theoretical and does not include empirical experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details. This paper is theoretical and does not include empirical experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details. This paper is theoretical and does not include empirical experiments. |
| Experiment Setup | No | The paper does not contain specific experimental setup details. This paper is theoretical and does not include empirical experiments. |