Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
# Online Decision Transformer
Authors: Qinqing Zheng, Amy Zhang, Aditya Grover
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, we validate our overall framework by comparing its performance with state-of-the-art algorithms on the D4RL benchmark (Fu et al., 2020). We find that our relative improvements due to our finetuning strategy outperform other baselines (Nair et al., 2020; Kostrikov et al., 2021b), while exhibiting competitive absolute performance when accounting for the pretraining results of the base model. Finally, we supplement our main results with rigorous ablations and additional experimental designs to justify and validate the key components of our approach. |
| Researcher Affiliation | Collaboration | 1Meta AI Research 2University of California, Berkeley 3University of California, Los Angeles. |
| Pseudocode | Yes | Algorithm 1: Online Decision Transformer; Algorithm 2: ODT Training |
| Open Source Code | No | No explicit statement by the authors providing their *own* source code for the methodology was found. The paper mentions: "We use the official Pytorch implmentation2 for DT, the official JAX implementation3 for IQL, and the Pytorch implementation4 (Yarats & Kostrikov, 2020) for SAC." This refers to third-party code for baselines. The link "For more results, visit https://sites.google.com/view/onlinedt/home." is a project homepage, not a code repository. |
| Open Datasets | Yes | For answering both these questions, we focus on two types of tasks with offline datasets from the D4RL benchmark (Fu et al., 2020). |
| Dataset Splits | No | The paper does not provide specific training/validation/test dataset splits. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, cloud instances) used for running experiments are mentioned. |
| Software Dependencies | No | The paper mentions software components like "Pytorch," "JAX," the "LAMB optimizer," and the "Adam optimizer," but does not specify version numbers for any of them. |
| Experiment Setup | Yes | The complete list of hyperparameters of ODT are summarized in Appendix C. Table C.1 lists the common hyperparameters and Table C.2 lists the domain specific ones. For all the experiments, we optimize the policy parameter θ by the LAMB optimizer (You et al., 2019)... |
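Since the Experiment Setup row notes that the policy parameters are optimized with the LAMB optimizer (You et al., 2019), a minimal sketch of a single LAMB update step may help clarify what that entails. This is an illustrative pure-Python implementation of the published LAMB update rule, not code from the paper; all names (`lamb_step`, the default hyperparameter values) are assumptions chosen for the example.

```python
import math

def lamb_step(w, g, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999,
              eps=1e-6, weight_decay=1e-4):
    """One LAMB update for a parameter vector w with gradient g.

    m, v are the running first/second moment estimates (same length as w);
    t is the 1-based step count. Returns the updated (w, m, v).
    """
    # Adam-style exponential moving averages of the gradient and its square
    m = [beta1 * mi + (1 - beta1) * gi for mi, gi in zip(m, g)]
    v = [beta2 * vi + (1 - beta2) * gi * gi for vi, gi in zip(v, g)]
    # Bias correction, as in Adam
    m_hat = [mi / (1 - beta1 ** t) for mi in m]
    v_hat = [vi / (1 - beta2 ** t) for vi in v]
    # Adam direction plus decoupled weight decay
    r = [mh / (math.sqrt(vh) + eps) + weight_decay * wi
         for mh, vh, wi in zip(m_hat, v_hat, w)]
    # Layer-wise trust ratio: scale the step by ||w|| / ||r||
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    r_norm = math.sqrt(sum(ri * ri for ri in r))
    trust = w_norm / r_norm if w_norm > 0 and r_norm > 0 else 1.0
    w = [wi - lr * trust * ri for wi, ri in zip(w, r)]
    return w, m, v
```

In a real training loop this update is applied per layer (the trust ratio is computed layer-wise), which is what makes LAMB robust to the large batch sizes used when training transformer policies.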