Decision Transformer under Random Frame Dropping
Authors: Kaizhe Hu, Ray Chen Zheng, Yang Gao, Huazhe Xu
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that DeFog outperforms strong baselines under severe frame drop rates such as 90%, while maintaining similar returns under non-frame-dropping conditions on the regular MuJoCo control benchmarks and the Atari environments. We evaluate our method on continuous and discrete control tasks in MuJoCo and Atari game environments. (See the frame-dropping sketch after this table.) |
| Researcher Affiliation | Academia | Kaizhe Hu, Ray Chen Zheng (Tsinghua University, Shanghai Qi Zhi Institute; hkz22@mails.tsinghua.edu.cn); Yang Gao, Huazhe Xu (Tsinghua University, Shanghai AI Lab, Shanghai Qi Zhi Institute) |
| Pseudocode | Yes | A ALGORITHM DETAILS: The overall algorithm of DeFog is summarized in Algorithm 1; for the hyperparameters we use, please refer to Appendix B.2. Algorithm 1: Decision Transformer under Random Frame Dropping (DeFog). (A drop-span embedding sketch follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement about releasing its own source code or a direct link to a code repository for the De Fog method. It mentions leveraging 'the implementation of Takuma Seno (2021)' for baselines, but not for their proposed method. |
| Open Datasets | Yes | In each of the three MuJoCo environments, we use D4RL (Fu et al., 2020), which contains offline datasets at three levels: expert, medium, and medium-replay. In the three Atari environments, we follow the Decision Transformer and train on a dataset sampled from a DQN agent's replay buffer (Agarwal et al., 2020). (See the D4RL loading sketch after the table.) |
| Dataset Splits | No | The paper mentions 'Total training steps' and 'Finetune training steps' and evaluation 'in test time' but does not specify the use of a distinct validation dataset split for hyperparameter tuning or early stopping during training. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments, such as GPU models, CPU specifications, or memory. |
| Software Dependencies | No | The paper mentions environments and datasets like 'Gym MuJoCo' and 'D4RL', and refers to libraries for baselines like 'd3rlpy'. However, it does not specify version numbers for any software dependencies like Python, PyTorch, TensorFlow, or specific libraries. |
| Experiment Setup | Yes | B.2 HYPERPARAMETER SETTINGS: For the Gym MuJoCo environments, we use the same model architecture as the Online Decision Transformer (Zheng et al., 2022). While the Online Decision Transformer uses different training parameters for each environment, we keep most training parameters the same across environments. Table 2: Common Parameters for Gym MuJoCo, (a) Architecture Parameters, (b) Training Parameters; Table 3: Dataset-Specific Parameters for Gym MuJoCo. (A configuration-shape sketch follows the table.) |
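The evaluation described in the Research Type row hinges on a simple mechanism: with probability equal to the drop rate, a frame never reaches the agent, which must act on the last frame it did receive. The sketch below illustrates that convention under the common "hold the last observed frame" assumption; `drop_frames`, its signature, and the returned drop-span array are our own illustrative choices, not the authors' code.

```python
import numpy as np

def drop_frames(observations: np.ndarray, drop_rate: float, rng=None):
    """Replace each dropped frame with the most recent surviving one.

    Also returns, per timestep, how many steps have passed since the
    frame was actually observed (its "drop span").
    """
    rng = rng or np.random.default_rng()
    obs = np.array(observations, copy=True)
    spans = np.zeros(len(obs), dtype=np.int64)
    for t in range(1, len(obs)):
        if rng.random() < drop_rate:      # this frame is dropped
            obs[t] = obs[t - 1]           # hold the last observed frame
            spans[t] = spans[t - 1] + 1   # one more step of staleness
    return obs, spans

# 90% drops, the severest rate evaluated in the paper.
held, spans = drop_frames(np.arange(10.0), drop_rate=0.9)
```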
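Algorithm 1 in the paper specifies the actual training procedure. As one plausible reading of how a sequence model can be told that a held frame is stale, here is a minimal drop-span embedding applied to state tokens; the module name, sizes, and interface are hypothetical, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DropSpanEmbedding(nn.Module):
    """Add a learned embedding of 'steps since the last real observation'
    to each state token, so the sequence model knows how stale it is."""
    def __init__(self, max_span: int, embed_dim: int):
        super().__init__()
        self.table = nn.Embedding(max_span + 1, embed_dim)

    def forward(self, state_tokens: torch.Tensor, spans: torch.Tensor) -> torch.Tensor:
        # state_tokens: (batch, seq, embed_dim); spans: (batch, seq) integer staleness
        spans = spans.clamp(max=self.table.num_embeddings - 1)
        return state_tokens + self.table(spans)

# Toy usage: 2 trajectories of 20 tokens with 128-dim embeddings.
tokens = torch.randn(2, 20, 128)
spans = torch.randint(0, 10, (2, 20))
out = DropSpanEmbedding(max_span=32, embed_dim=128)(tokens, spans)
```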
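The D4RL datasets cited in the Open Datasets row are exposed through gym once `d4rl` is imported. A minimal loading loop over the three levels, using Hopper as the example environment, might look like this (the `-v2` suffix is an assumption about the D4RL version in use):

```python
import gym
import d4rl  # noqa: F401 -- importing d4rl registers the offline datasets with gym

# The three dataset levels the paper uses per MuJoCo environment.
for level in ("expert", "medium", "medium-replay"):
    env = gym.make(f"hopper-{level}-v2")
    data = env.get_dataset()  # dict with 'observations', 'actions', 'rewards', ...
    print(level, data["observations"].shape)
```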
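Appendix B.2 splits the setup into common architecture and training parameters (Table 2) and dataset-specific ones (Table 3). The excerpt above does not reproduce the numbers, so the dataclass below only sketches the shape of such a configuration; every field value is a placeholder, not the paper's setting.

```python
from dataclasses import dataclass

@dataclass
class DeFogGymConfig:
    # Architecture parameters (cf. Table 2a) -- placeholder values.
    embed_dim: int = 512
    n_layers: int = 4
    n_heads: int = 4
    context_length: int = 20
    # Training parameters shared across environments (cf. Table 2b) -- placeholders.
    learning_rate: float = 1e-4
    batch_size: int = 256
    # Dataset-specific parameters (cf. Table 3) would be overridden per dataset.
    train_drop_rate: float = 0.9  # placeholder frame-drop probability during training
```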