reproducibilityindex.ai

Offline Meta Reinforcement Learning with In-Distribution Online Adaptation

Authors: Jianhao Wang, Jin Zhang, Haozhe Jiang, Junyu Zhang, Liwei Wang, Chongjie Zhang

ICML 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments show that IDAQ achieves state-of-the-art performance on the Meta-World ML1 benchmark compared to baselines with/without offline adaptation. Empirical results show that IDAQ significantly outperforms baselines with fast online adaptation, and achieves better or comparable performance than offline adaptation baselines with expert context.
Researcher Affiliation	Academia	(1) Institute for Interdisciplinary Information Sciences, Tsinghua University; (2) Huazhong University of Science and Technology; (3) Institute of Artificial Intelligence, Peking University.
Pseudocode	Yes	The overall algorithm of IDAQ is illustrated in Algorithm 1.
Open Source Code	Yes	An open-source implementation of our algorithm is available online at https://github.com/NagisaZj/IDAQ_Public.
Open Datasets	Yes	We extensively evaluate the performance of IDAQ in didactic problems proposed by prior work (Rakelly et al., 2019; Zhang et al., 2021) and Meta-World ML1 benchmark with 50 tasks (Yu et al., 2020b).
Dataset Splits	No	For each task set, we use 40 tasks as meta-training tasks and keep the other 10 tasks as meta-testing tasks. The paper does not explicitly mention a separate validation dataset split.
Hardware Specification	No	The paper does not provide specific hardware details such as GPU/CPU models, memory, or specific cloud instances used for running experiments.
Software Dependencies	No	The paper mentions software components like 'optimizer adam' and network structures but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup	Yes	Table 4 shows hyper-parameter settings for the task sets used in our experiments, and Table 5 shows IDAQ's hyper-parameter settings.
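
The Dataset Splits row describes a 40/10 division of each 50-task set into meta-training and meta-testing tasks. A minimal sketch of such a split is given below. It assumes a seeded random shuffle over integer task indices; the entry above does not say how the paper actually assigns tasks, and the function name `split_tasks` and the `seed` parameter are hypothetical, not taken from the IDAQ code.

```python
import random


def split_tasks(num_tasks=50, num_train=40, seed=0):
    """Split task indices into meta-training and meta-testing sets.

    Assumption: tasks are assigned by a seeded random shuffle. The paper's
    actual assignment procedure is not stated in this summary.
    """
    rng = random.Random(seed)          # local RNG so the split is repeatable
    task_ids = list(range(num_tasks))
    rng.shuffle(task_ids)
    train = sorted(task_ids[:num_train])
    test = sorted(task_ids[num_train:])
    return train, test


train_tasks, test_tasks = split_tasks()
print(len(train_tasks), len(test_tasks))
```

Because no validation split is mentioned, any hyper-parameter tuning in a reproduction would need its own held-out subset carved from the 40 meta-training tasks.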