Elastic Decision Transformer
Authors: Yueh-Hua Wu, Xiaolong Wang, Masashi Hamaya
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimentation demonstrates EDT's ability to bridge the performance gap between DT-based and Q-learning-based approaches. In particular, EDT outperforms Q-learning-based methods in a multi-task regime on the D4RL locomotion benchmark and Atari games. |
| Researcher Affiliation | Collaboration | Yueh-Hua Wu (UC San Diego, OMRON SINIC X), Xiaolong Wang (UC San Diego), Masashi Hamaya (OMRON SINIC X) |
| Pseudocode | Yes | Algorithm 1 EDT optimal history length search |
| Open Source Code | No | We are committed to releasing our code. |
| Open Datasets | Yes | EDT is evaluated on publicly available benchmarks: the D4RL locomotion datasets (medium and medium-replay) and Atari games. |
| Dataset Splits | No | The paper describes using D4RL medium and medium-replay datasets and mentions 'Mean of 5 random training initialization seeds, 100 evaluations each', but does not explicitly define specific train/validation/test dataset splits with percentages or sample counts for reproducibility. |
| Hardware Specification | Yes | GPU: 3090 Ti / V100 |
| Software Dependencies | No | The paper mentions 'Optimizer AdamW' and other high-level components, but does not provide version numbers for software dependencies such as Python, PyTorch, or other libraries. |
| Experiment Setup | Yes | Table 6: Hyper-parameters. All hyper-parameters are listed for D4RL and Atari (D4RL / Atari): maximum history length 20 / 30; inverse temperature (κ) 10 / 10; expectile level 0.99 / 0.99; batch size 256 / 256; step size (δ) 2 / 2. |
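The pseudocode row above names Algorithm 1 (EDT's optimal history length search), and the hyper-parameter row supplies the values it uses (expectile level 0.99, step size δ = 2, maximum history length 20 for D4RL). As a rough illustration only, the sketch below shows an expectile loss at that level and a grid search over candidate history lengths; `estimate_max_return` is a hypothetical stand-in for EDT's learned maximal-return estimator, not the paper's actual model.

```python
import numpy as np

def expectile_loss(pred, target, level=0.99):
    # Asymmetric squared error: at level=0.99 under-predictions are
    # penalized ~99x more than over-predictions, so the regressed value
    # approximates the *maximum* achievable return (Table 6 setting).
    diff = target - pred
    weight = np.where(diff > 0, level, 1.0 - level)
    return float(np.mean(weight * diff ** 2))

def search_history_length(estimate_max_return, max_len=20, step=2):
    # Sketch of the history-length search: evaluate the estimated
    # maximal return for each candidate length (stride = step size
    # delta from Table 6) and keep the length that maximizes it.
    best_len, best_ret = None, -np.inf
    for length in range(1, max_len + 1, step):
        ret = estimate_max_return(length)
        if ret > best_ret:
            best_len, best_ret = length, ret
    return best_len, best_ret
```

For example, with a toy estimator that peaks at length 9, the search returns that length; in EDT the estimator would instead be the transformer's expectile-trained return head evaluated on the truncated trajectory.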