Provable Benefits of Multi-task RL under Non-Markovian Decision Making Processes

Authors: Ruiquan Huang, Yuan Cheng, Jing Yang, Vincent Tan, Yingbin Liang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our work is the first theoretical study that characterizes the benefits of multi-task RL with PSRs/POMDPs over its single-task counterpart.
Researcher Affiliation | Academia | Penn State University, State College, PA 16801, USA, {rzh514,yangjing}@psu.edu; National University of Singapore, 119077, Singapore, yuan.cheng@u.nus.edu, vtan@nus.edu.sg; Ohio State University, Columbus, OH 43210, USA, liang.889@osu.edu.
Pseudocode | Yes | Algorithm 1: Upstream Multi-Task PSRs (UMT-PSR)
Open Source Code | No | The paper does not provide an explicit statement of open-source code release, nor does it include any links to code repositories.
Open Datasets | No | This is a theoretical paper that does not perform empirical studies with specific datasets. While it discusses 'data collection' in the context of its proposed algorithm, it does not use or provide access information for a publicly available training dataset.
Dataset Splits | No | This is a theoretical paper and does not involve empirical experiments with dataset splits. Therefore, it does not provide specific training, validation, or test dataset split information.
Hardware Specification | No | This is a theoretical paper focusing on algorithm design and theoretical guarantees; thus, it does not describe any specific hardware used for experiments.
Software Dependencies | No | This is a theoretical paper that proposes algorithms and provides theoretical analysis. It does not describe implementation details or specific software dependencies with version numbers.
Experiment Setup | No | This paper is theoretical and focuses on mathematical proofs and algorithm design. It does not describe an experimental setup with specific hyperparameters or training configurations.