Elastic Decision Transformer

Authors: Yueh-Hua Wu, Xiaolong Wang, Masashi Hamaya

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experimentation demonstrates EDT's ability to bridge the performance gap between DT-based and Q-learning-based approaches. In particular, EDT outperforms Q-learning-based methods in a multi-task regime on the D4RL locomotion benchmark and Atari games.
Researcher Affiliation Collaboration Yueh-Hua Wu (UC San Diego, OMRON SINIC X), Xiaolong Wang (UC San Diego), Masashi Hamaya (OMRON SINIC X)
Pseudocode Yes Algorithm 1: EDT optimal history length search
Open Source Code No We are committed to releasing our code.
Open Datasets Yes Extensive experimentation demonstrates EDT's ability to bridge the performance gap between DT-based and Q-learning-based approaches. In particular, EDT outperforms Q-learning-based methods in a multi-task regime on the D4RL locomotion benchmark and Atari games.
Dataset Splits No The paper describes using D4RL medium and medium-replay datasets and mentions 'Mean of 5 random training initialization seeds, 100 evaluations each', but does not explicitly define specific train/validation/test dataset splits with percentages or sample counts for reproducibility.
Hardware Specification Yes GPU: 3090 Ti, V100
Software Dependencies No The paper mentions 'Optimizer AdamW' and other high-level components, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or other libraries.
Experiment Setup Yes Table 6: Hyper-parameters. Settings are listed for D4RL / Atari respectively: ... maximum history length 20 / 30; inverse temperature (κ) 10 / 10; expectile level 0.99 / 0.99; batch size 256 / 256; step size (δ) 2 / 2.
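The pseudocode row above cites Algorithm 1, EDT's optimal history length search, and the experiment-setup row gives the relevant hyper-parameters (maximum history length 20, step size δ = 2, expectile level 0.99 on D4RL). Since the code has not been released, the following is only a minimal sketch of the idea under stated assumptions: the `predict_max_return` head and its interface are hypothetical stand-ins, not the authors' implementation.

```python
def expectile_loss(pred, target, tau=0.99):
    """Asymmetric squared loss used to train a high-expectile return
    estimator: under-estimates are weighted by tau (here 0.99)."""
    diff = target - pred
    weight = tau if diff > 0 else 1.0 - tau
    return weight * diff * diff


def search_history_length(model, history, max_len=20, step=2):
    """Sketch of Algorithm 1: try truncated histories of increasing
    length and keep the one whose estimated maximum achievable
    return is highest. `model.predict_max_return` is an assumed
    interface to an expectile-trained return head."""
    best_len, best_ret = None, float("-inf")
    for t in range(step, max_len + 1, step):
        truncated = history[-t:]  # keep only the most recent t steps
        est = model.predict_max_return(truncated)
        if est > best_ret:
            best_len, best_ret = t, est
    return best_len
```

At inference, the agent would then condition its action prediction on only the selected suffix of the trajectory, which is what lets EDT "stitch" away from a poor earlier history.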