Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Provably efficient multi-task reinforcement learning with model transfer
Authors: Chicheng Zhang, Zhi Wang
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study multi-task reinforcement learning (RL) in tabular episodic Markov decision processes (MDPs). We formulate a heterogeneous multi-player RL problem, in which a group of players concurrently face similar but not necessarily identical MDPs, with a goal of improving their collective performance through inter-player information sharing. We design and analyze an algorithm based on the idea of model transfer, and provide gap-dependent and gap-independent upper and lower bounds that characterize the intrinsic complexity of the problem. |
| Researcher Affiliation | Academia | Chicheng Zhang, University of Arizona, EMAIL; Zhi Wang, University of California San Diego, EMAIL |
| Pseudocode | Yes | Algorithm 1: MULTI-TASK-EULER |
| Open Source Code | No | No explicit statement or link for open-source code for the described methodology was found. |
| Open Datasets | No | As a theoretical paper, no specific dataset is used for training or evaluation. The paper describes a problem setting in tabular episodic Markov decision processes (MDPs) rather than using a concrete dataset. |
| Dataset Splits | No | As a theoretical paper, there is no mention of train/validation/test dataset splits. |
| Hardware Specification | No | As a theoretical paper, no hardware specifications for running experiments are mentioned. |
| Software Dependencies | No | As a theoretical paper, no specific software dependencies with version numbers for experimental replication are mentioned. |
| Experiment Setup | No | As a theoretical paper, no specific experimental setup details such as hyperparameters or training configurations are provided. |