Critic-Guided Decision Transformer for Offline Reinforcement Learning

Authors: Yuanfu Wang, Chao Yang, Ying Wen, Yu Liu, Yu Qiao

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations on stochastic environments and D4RL benchmark datasets demonstrate the superiority of CGDT over traditional RCSL methods. These results highlight the potential of CGDT to advance the state of the art in offline RL and extend the applicability of RCSL to a wide range of RL tasks.
Researcher Affiliation | Collaboration | Yuanfu Wang*¹,², Chao Yang*², Ying Wen¹, Yu Liu²,³, Yu Qiao² (¹Shanghai Jiao Tong University, ²Shanghai Artificial Intelligence Laboratory, ³SenseTime Research)
Pseudocode | Yes | Algorithm 1: Critic-Guided Decision Transformer
Open Source Code | No | The paper does not contain an explicit statement or a direct link indicating that the source code for the methodology is publicly available.
Open Datasets | Yes | We conduct further experiments on the D4RL datasets (Fu et al. 2020).
Dataset Splits | No | While the paper mentions using "validation errors as a means to detect overfitting during critic training," it does not specify the percentages, counts, or methodology of the training, validation, and test splits.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running its experiments.
Software Dependencies | No | The paper does not list the software dependencies (libraries, frameworks, or version numbers) needed to replicate the experiments.
Experiment Setup | Yes | The algorithm implementation details are summarized in Algorithm 1. Initially, we set the hyperparameters τc and τp to 0.5. By varying τc and τp within the range [0.3, 0.7], we control the asymmetries during critic training and policy training, respectively.
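
For context on the Open Datasets row: the D4RL benchmark is distributed through the open-source d4rl package. The snippet below is a minimal loading sketch, not taken from the paper; the environment name is an illustrative choice, not a setting the authors report.

```python
# Minimal D4RL loading sketch (assumes the open-source `d4rl` and `gym` packages);
# the environment name is illustrative, not a configuration from the paper.
import gym
import d4rl  # importing d4rl registers the D4RL environments with gym

env = gym.make("hopper-medium-v2")
dataset = d4rl.qlearning_dataset(env)  # dict of observations, actions, rewards, terminals, next_observations
print(dataset["observations"].shape, dataset["actions"].shape)
```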
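The Experiment Setup row describes τc and τp as parameters that control asymmetry during critic and policy training. The sketch below illustrates the general mechanism by which a single τ in (0, 1) skews a regression loss (an expectile-style asymmetric L2); this is an assumption about the form of the objective used for illustration, not the paper's exact loss.

```python
import torch

def asymmetric_l2_loss(u: torch.Tensor, tau: float) -> torch.Tensor:
    # u = target - prediction. Positive residuals are weighted by tau and
    # negative residuals by (1 - tau); tau = 0.5 recovers the ordinary MSE,
    # while larger tau pushes the estimate toward upper expectiles of the target.
    weight = torch.where(u > 0, torch.full_like(u, tau), torch.full_like(u, 1.0 - tau))
    return (weight * u.pow(2)).mean()

# Illustrative call with tau inside the [0.3, 0.7] range mentioned in the table;
# tau = 0.5 matches the symmetric initialization the paper reports.
q_pred = torch.randn(256)    # stand-in critic outputs
q_target = torch.randn(256)  # stand-in regression targets
critic_loss = asymmetric_l2_loss(q_target - q_pred, tau=0.5)
```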