Adversarially Robust Decision Transformer

Authors: Xiaohang Tang, Afonso Marques, Parameswaran Kamalaruban, Ilija Bogunovic

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we conduct experiments to examine the robustness of our algorithm, Adversarially Robust Decision Transformer (ARDT), in three settings: (i) short-horizon sequential games, where the offline dataset has full coverage and the test-time adversary is optimal (Section 4.1); (ii) a long-horizon sequential game, Connect Four, where the offline dataset has only partial coverage and the test-time adversary is distributionally shifted (Section 4.2); and (iii) standard continuous MuJoCo tasks in the adversarial setting, with a population of test-time adversaries (Section 4.3).
Researcher Affiliation Collaboration Xiaohang Tang University College London xiaohang.tang.20@ucl.ac.uk Afonso Marques University College London afonso.marques.22@ucl.ac.uk Parameswaran Kamalaruban Featurespace kamal.parameswaran@featurespace.co.uk Ilija Bogunovic University College London i.bogunovic@ucl.ac.uk
Pseudocode Yes Algorithm 1 Adversarially Robust Decision Transformer (ARDT)
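The paper's hyperparameters report an expectile level of 0.01, which suggests ARDT relabels returns with a low-expectile regression to approximate the worst-case (minimax) return against the adversary. A minimal sketch of the standard asymmetric expectile loss is shown below; the function name and scalar interface are illustrative, not taken from the paper's codebase.

```python
def expectile_loss(pred, target, tau=0.01):
    # Asymmetric squared error. With tau << 0.5, overestimates
    # (pred > target) are weighted by 1 - tau (~0.99), so the
    # regressed value is pushed toward the minimum of the target
    # distribution, i.e. a pessimistic / worst-case return estimate.
    u = target - pred
    weight = tau if u >= 0 else 1.0 - tau
    return weight * u * u
```

With tau = 0.01, overshooting the target by 1 costs 0.99 while undershooting by 1 costs only 0.01, which is what drives the estimate toward the adversarial worst case.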
Open Source Code Yes We publish our datasets along with the codebase via https://github.com/xiaohangt/ardt. There are no data access restrictions.
Open Datasets Yes We publish our datasets along with the codebase via https://github.com/xiaohangt/ardt. There are no data access restrictions. The MuJoCo data profiles are in Tables 4 and 5. The MuJoCo data contains 1000 trajectories, each with 1000 steps of interaction. The Connect Four datasets also have 10^6 steps of interaction in total, where each trajectory has a length of at most 22.
Dataset Splits No The paper mentions "Number of training steps" and "Number of testing iterations" in Table 3, but does not explicitly describe a separate validation split or the percentage/number of samples used for validation.
Hardware Specification Yes We conduct experiments on GPUs: a GeForce RTX 2080 Ti with 11GB of memory, and an NVIDIA A100 with 80GB of memory.
Software Dependencies No The paper states that its implementation is based on others and mentions environments such as MuJoCo, but does not provide specific version numbers for software dependencies such as Python, PyTorch, or TensorFlow.
Experiment Setup Yes Process Hyperparameters Values (Full coverage game / Connect Four / MuJoCo) ... Learning rate 0.0001 | Weight decay 0.0001 | Warmup steps 1000 | Dropout 0.1 | Batch size 128/128/512 | Optimizer AdamW ... Expectile level 0.01
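The reported optimization settings (learning rate 0.0001, 1000 warmup steps) imply a warmup schedule, though the paper does not specify its exact shape. A minimal sketch assuming linear warmup followed by a constant rate; the function name and the constant-after-warmup choice are assumptions, not details from the paper.

```python
def warmup_lr(step, base_lr=1e-4, warmup_steps=1000):
    # Linear warmup to the reported learning rate (1e-4) over the
    # reported 1000 warmup steps, then held constant. The constant
    # tail is an assumption; the paper only lists the two values.
    return base_lr * min(1.0, step / warmup_steps)
```

For example, at step 500 this yields half the base rate (5e-5), and any step past 1000 returns the full 1e-4.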