On the Role of General Function Approximation in Offline Reinforcement Learning
Authors: Chenjie Mao, Qiaosheng Zhang, Zhen Wang, Xuelong Li
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study offline reinforcement learning (RL) with general function approximation. General function approximation is a powerful tool for algorithm design and analysis, but its adaptation to offline RL encounters several challenges due to varying approximation targets and assumptions that blur the real meanings of function assumptions. In this paper, we try to formulate and clarify the treatment of general function approximation in offline RL in two aspects: (1) analyzing different types of assumptions and their practical usage, and (2) understanding its role as a restriction on underlying MDPs from information-theoretic perspectives. Additionally, we introduce a new insight for establishing lower bounds: one can exploit model realizability to establish general-purpose lower bounds that can be generalized into other functions. Building upon this insight, we propose two generic lower bounds that contribute to a better understanding of offline RL with general function approximation. |
| Researcher Affiliation | Collaboration | Chenjie Mao1,2, Qiaosheng Zhang1, Zhen Wang3, Xuelong Li4; 1Shanghai Artificial Intelligence Laboratory, 2Huazhong University of Science and Technology, 3Northwestern Polytechnical University, 4Institute of Artificial Intelligence (TeleAI), China Telecom Corp Ltd |
| Pseudocode | No | The paper describes theoretical concepts, proofs, and analyses of lower bounds in offline RL. It does not include any pseudocode blocks or formal algorithm descriptions. |
| Open Source Code | No | The paper does not contain any statements about making source code available or provide links to a code repository. |
| Open Datasets | No | The paper is a theoretical work focusing on lower bounds in offline RL. While it discusses 'dataset D' in a conceptual framework for theoretical analysis, it does not use any specific, named datasets for empirical training or evaluation, nor does it provide access information for such datasets. |
| Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with specific dataset splits for training, validation, or testing. It discusses theoretical properties of function approximation and lower bounds. |
| Hardware Specification | No | The paper is purely theoretical and focuses on mathematical proofs and analyses of lower bounds in offline reinforcement learning. It does not describe any computational experiments or refer to any hardware used for such purposes. |
| Software Dependencies | No | The paper is theoretical and does not present any empirical experiments. Therefore, it does not list any software dependencies with version numbers required to reproduce experimental results. |
| Experiment Setup | No | The paper is a theoretical work that establishes lower bounds and analyzes function approximation. It does not describe any empirical experimental setup, hyperparameters, or training configurations. |