On the Role of General Function Approximation in Offline Reinforcement Learning

Authors: Chenjie Mao, Qiaosheng Zhang, Zhen Wang, Xuelong Li

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We study offline reinforcement learning (RL) with general function approximation. General function approximation is a powerful tool for algorithm design and analysis, but its adaptation to offline RL encounters several challenges due to varying approximation targets and assumptions that blur the real meanings of function assumptions. In this paper, we try to formulate and clarify the treatment of general function approximation in offline RL in two aspects: (1) analyzing different types of assumptions and their practical usage, and (2) understanding its role as a restriction on underlying MDPs from information-theoretic perspectives. Additionally, we introduce a new insight for establishing lower bounds: one can exploit model realizability to establish general-purpose lower bounds that can be generalized to other functions. Building upon this insight, we propose two generic lower bounds that contribute to a better understanding of offline RL with general function approximation.
Researcher Affiliation | Collaboration | Chenjie Mao (1, 2), Qiaosheng Zhang (1), Zhen Wang (3), Xuelong Li (4). Affiliations: (1) Shanghai Artificial Intelligence Laboratory; (2) Huazhong University of Science and Technology; (3) Northwestern Polytechnical University; (4) Institute of Artificial Intelligence (TeleAI), China Telecom Corp Ltd.
Pseudocode | No | The paper describes theoretical concepts, proofs, and analyses of lower bounds in offline RL. It does not include any pseudocode blocks or formal algorithm descriptions.
Open Source Code | No | The paper does not contain any statements about making source code available, nor does it provide links to a code repository.
Open Datasets | No | The paper is a theoretical work focusing on lower bounds in offline RL. While it discusses a dataset D in its conceptual framework for theoretical analysis, it does not use any specific, named datasets for empirical training or evaluation, nor does it provide access information for such datasets.
Dataset Splits | No | The paper is theoretical and does not involve empirical experiments with specific dataset splits for training, validation, or testing. It discusses theoretical properties of function approximation and lower bounds.
Hardware Specification | No | The paper is purely theoretical and focuses on mathematical proofs and analyses of lower bounds in offline reinforcement learning. It does not describe any computational experiments or refer to any hardware used for such purposes.
Software Dependencies | No | The paper is theoretical and does not present any empirical experiments. Therefore, it does not list any software dependencies with version numbers required to reproduce experimental results.
Experiment Setup | No | The paper is a theoretical work that establishes lower bounds and analyzes function approximation. It does not describe any empirical experimental setup, hyperparameters, or training configurations.
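Note: the abstract quoted above appeals to realizability-type function assumptions. As a point of reference only (this formalization is not taken from the paper itself), the two standard realizability conditions in offline RL are usually stated as follows, where $\mathcal{F}$ denotes a value-function class, $\mathcal{M}$ a model class, $\pi$ the target policy, and $M^{*}$ the true underlying MDP:

$$
Q^{\pi} \in \mathcal{F} \quad \text{(value realizability)},
\qquad
M^{*} \in \mathcal{M} \quad \text{(model realizability)}.
$$

The paper's lower-bound insight concerns the second, model-based form; the display above is included only to fix notation for readers of this summary.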