reproducibilityindex.ai

Online Robust Reinforcement Learning with Model Uncertainty

Authors: Yue Wang, Shaofeng Zou

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our numerical experiments further demonstrate the robustness of our algorithms.
Researcher Affiliation	Academia	Yue Wang University at Buffalo Buffalo, NY 14228 ywang294@buffalo.edu Shaofeng Zou University at Buffalo Buffalo, NY 14228 szou3@buffalo.edu
Pseudocode	Yes	Algorithm 1 Robust Q-Learning; Algorithm 2 Robust TDC with Linear Function Approximation
Open Source Code	No	The paper does not provide any links to open-source code or explicitly state that code is made available.
Open Datasets	Yes	We use Open AI gym framework [Brockman et al., 2016], and consider two different problems: Frozen lake and Cart-Pole.
Dataset Splits	No	The paper describes training on a 'perturbed MDP' and testing on an 'unperturbed MDP' but does not specify a separate validation split or its methodology.
Hardware Specification	No	The paper does not specify any hardware used for the experiments (e.g., CPU, GPU models).
Software Dependencies	No	The paper mentions 'Open AI gym framework' but does not provide version numbers for this or any other software components.
Experiment Setup	Yes	The behavior policy for all the experiments below is set to be a uniform distribution over the action space given any state, i.e., π_b(a|s) = 1/|A| for any s ∈ S and a ∈ A. We take the average over 30 trajectories. We set α = 0.2 and γ = 0.9.
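
The experiment setup quoted above can be illustrated with a minimal Python sketch. It covers only the uniform behavior policy, the reported constants, and the averaging over 30 trajectories. It is not the paper's Robust Q-Learning or Robust TDC algorithm, and all function names (`uniform_behavior_policy`, `uniform_policy_prob`, `average_over_trajectories`) and the seeding scheme are hypothetical choices made for illustration.

```python
import random

ALPHA = 0.2             # step size reported in the setup
GAMMA = 0.9             # discount factor reported in the setup
NUM_TRAJECTORIES = 30   # number of trajectories the results are averaged over


def uniform_policy_prob(num_actions):
    """Probability of any action under the uniform policy: 1/|A|."""
    return 1.0 / num_actions


def uniform_behavior_policy(num_actions, rng):
    """Sample an action uniformly at random, independent of the state."""
    return rng.randrange(num_actions)


def average_over_trajectories(run_trajectory, num_trajectories=NUM_TRAJECTORIES, seed=0):
    """Average a scalar result of run_trajectory(rng) over independent runs.

    The per-run seeding (seed + i) is an assumption, not taken from the paper.
    """
    total = 0.0
    for i in range(num_trajectories):
        rng = random.Random(seed + i)
        total += run_trajectory(rng)
    return total / num_trajectories
```

Here `run_trajectory` stands in for whatever procedure generates one trajectory and returns a scalar (for example, a return or an error measure); the source does not specify it.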