Time-Constrained Robust MDPs
Authors: Adil Zouitine, David Bertoin, Pierre Clavier, Matthieu Geist, Emmanuel Rachelson
NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose three distinct algorithms, each using varying levels of environmental information, and evaluate them extensively on continuous control benchmarks. Our results demonstrate that these algorithms yield an efficient tradeoff between performance and robustness, outperforming traditional deep robust RL methods in time-constrained environments while preserving robustness in classical benchmarks. |
| Researcher Affiliation | Collaboration | Adil Zouitine 1,2, David Bertoin 1,3,6, Pierre Clavier 4,5, Matthieu Geist 7, Emmanuel Rachelson 2,6; 1 IRT Saint-Exupéry, 2 ISAE-SUPAERO, Université de Toulouse, 3 IMT, INSA Toulouse, 4 École Polytechnique, CMAP, 5 Inria Paris, HeKA, 6 ANITI, 7 Cohere; {adil.zouitine, david.bertoin}@irt-saintexupery.com, pierre.clavier@polytechnique.edu |
| Pseudocode | Yes | Algorithm 1 Time-constrained robust training |
| Open Source Code | No | The paper states in the NeurIPS checklist that code is provided for reproduction, but neither the main body nor the appendices contain an explicit statement of code release for the authors' specific methodology or a direct link to a repository. |
| Open Datasets | Yes | Experimental validation was conducted in continuous control scenarios using the MuJoCo simulation environments [5]. (A minimal environment-loading sketch follows the table.) |
| Dataset Splits | No | The paper does not provide training/validation/test dataset splits. As a reinforcement learning paper, it evaluates trained policies directly in the environments rather than using the data splits common in supervised learning. |
| Hardware Specification | Yes | All experiments were run on a desktop machine (Intel i9, 10th generation processor, 64GB RAM) with a single NVIDIA RTX 4090 GPU. |
| Software Dependencies | No | The paper mentions using the official M2TD3 [18] implementation and the TD3 implementation from the CleanRL library [32], but it does not provide specific version numbers for these software components or other dependencies like Python or PyTorch. |
| Experiment Setup | Yes | Table 5: Hyperparameters for the M2TD3 Agent and Table 6: Hyperparameters for the TD3 Agent provide specific hyperparameter values such as Batch Size, Learning Rate, and Gamma. (A hedged configuration sketch follows the table.) |
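
As a hedged illustration of the Open Datasets row, the sketch below shows how MuJoCo continuous-control benchmarks of the kind used in the paper can be instantiated through the Gymnasium API. The environment ids, the `make_env` helper, and the mass-scaling perturbation are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch (assumption): loading MuJoCo continuous-control benchmarks
# via Gymnasium and applying a simple physical perturbation, as is common
# when evaluating robust RL agents. Environment ids and the mass scaling
# below are illustrative, not taken from the paper.
import gymnasium as gym

ENV_IDS = ["HalfCheetah-v4", "Hopper-v4", "Walker2d-v4", "Ant-v4"]  # assumed list

def make_env(env_id: str, mass_scale: float = 1.0):
    """Create a MuJoCo environment and optionally scale body masses to
    emulate a shifted transition kernel (hypothetical helper)."""
    env = gym.make(env_id)
    env.unwrapped.model.body_mass[:] *= mass_scale  # perturb the dynamics
    return env

if __name__ == "__main__":
    env = make_env("Hopper-v4", mass_scale=1.5)
    obs, info = env.reset(seed=0)
    obs, reward, terminated, truncated, info = env.step(env.action_space.sample())
    env.close()
```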
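
For the Experiment Setup row, the paper's Tables 5 and 6 contain the actual hyperparameter values; the sketch below only illustrates how such a configuration might be organized in code. Every value is a placeholder drawn from common TD3 defaults, not from the paper.

```python
# Hypothetical configuration sketch: field names mirror the kinds of
# hyperparameters reported in Tables 5 and 6 (batch size, learning rate,
# gamma), but all values are placeholders, not the paper's settings.
from dataclasses import dataclass

@dataclass
class AgentConfig:
    batch_size: int = 256        # placeholder; see Tables 5/6 for actual values
    learning_rate: float = 3e-4  # placeholder
    gamma: float = 0.99          # discount factor, placeholder
    tau: float = 0.005           # target-network soft-update rate, placeholder
    policy_noise: float = 0.2    # TD3 target-policy smoothing noise, placeholder
    policy_delay: int = 2        # TD3 delayed policy updates, placeholder

td3_config = AgentConfig()
m2td3_config = AgentConfig(learning_rate=1e-3)  # illustrative override only
```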