Represent to Control Partially Observed Systems: Representation Learning with Provable Sample Efficiency
Authors: Lingxiao Wang, Qi Cai, Zhuoran Yang, Zhaoran Wang
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | For a class of POMDPs with a low-rank structure in the transition kernel, RTC attains an O(1/ϵ²) sample complexity that scales polynomially with the horizon and the intrinsic dimension (that is, the rank). Here ϵ is the optimality gap. To our best knowledge, RTC is the first sample-efficient algorithm that bridges representation learning and policy optimization in POMDPs with infinite observation and state spaces. |
| Researcher Affiliation | Academia | Lingxiao Wang, Qi Cai: Department of Industrial Engineering and Management Sciences, Northwestern University, {lingxiaowang2022,qicai2022}@northwestern.edu; Zhuoran Yang: Department of Statistics and Data Science, Yale University, zhuoran.yang@yale.edu; Zhaoran Wang: Department of Industrial Engineering and Management Sciences, Northwestern University, zhaoranwang@gmail.com |
| Pseudocode | Yes | Algorithm 1 Represent to Control |
| Open Source Code | No | The paper does not provide any concrete access information for open-source code. |
| Open Datasets | No | The paper does not provide concrete access information for a publicly available or open dataset. This paper is theoretical and does not include empirical experiments. |
| Dataset Splits | No | The paper does not provide specific dataset split information. This paper is theoretical and does not include empirical experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details. This paper is theoretical and does not include empirical experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details. This paper is theoretical and does not include empirical experiments. |
| Experiment Setup | No | The paper does not contain specific experimental setup details. This paper is theoretical and does not include empirical experiments. |