Model-Free Robust Average-Reward Reinforcement Learning
Authors: Yue Wang, Alvaro Velasquez, George K. Atia, Ashley Prater-Bennette, Shaofeng Zou
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 6 (Experiments): 'We numerically verify our previous convergence results and demonstrate the robustness of our algorithms. Additional experiments can be found in Appendix G.' |
| Researcher Affiliation | Collaboration | 1University at Buffalo, 2University of Colorado Boulder, 3University of Central Florida, 4Air Force Research Laboratory |
| Pseudocode | Yes | Algorithm 1 Robust RVI TD |
| Open Source Code | No | The paper does not contain any explicit statements or links indicating the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper describes the problem environments used (Garnet problem, Frozen-Lake environment, Recycling Robot, Inventory Control Problem) and how transition kernels and environment parameters are generated, but provides no concrete access information (links, DOIs, or dataset citations) for any pre-existing, publicly available dataset. |
| Dataset Splits | No | The paper reports experimental repetitions (e.g., runs repeated '30 times') but specifies no training/validation/test dataset splits (no percentages or sample counts). |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory, or cloud instances) used for running the experiments. |
| Software Dependencies | No | The paper mentions 'Open AI Gym' but does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | 'We set δ = 0.4, αn = 0.01, f(V) = \|S\|\|A\|' and 'We set δ = 0.4 and implement our algorithms and vanilla Q-learning under the nominal environment (α = β = 0.5) with stepsize 0.01.' |
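The setup row above can be illustrated with a minimal sketch of an RVI-style robust Q-learning update under a δ-contamination uncertainty set, plugging in the quoted δ = 0.4 and stepsize 0.01. The toy random MDP, the mean-of-Q choice of reference function f, and the contamination form of the worst-case value are illustrative assumptions here, not the paper's exact implementation (the quoted definition of f is not fully legible in the extracted text).

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy random MDP (hypothetical stand-in; the paper uses Garnet, Frozen-Lake, etc.)
nS, nA = 4, 2
P = rng.dirichlet(np.ones(nS), size=(nS, nA))  # nominal transition kernel, shape (nS, nA, nS)
R = rng.uniform(0.0, 1.0, size=(nS, nA))       # bounded rewards

delta = 0.4    # contamination level, as quoted from the paper
alpha = 0.01   # stepsize, as quoted from the paper
n_steps = 5000

Q = np.zeros((nS, nA))
s = 0
for _ in range(n_steps):
    a = rng.integers(nA)
    s_next = rng.choice(nS, p=P[s, a])
    V = Q.max(axis=1)
    # Worst-case next value under a delta-contamination set: mix the sampled
    # transition with the minimum value over states (illustrative form).
    robust_v = (1.0 - delta) * V[s_next] + delta * V.min()
    # RVI-style offset f(Q): mean of all Q-entries (one common reference-function
    # choice; assumed here, not confirmed as the paper's f).
    f_Q = Q.mean()
    # Average-reward robust Q-learning update: reward minus offset plus robust value.
    Q[s, a] += alpha * (R[s, a] - f_Q + robust_v - Q[s, a])
    s = s_next
```

With bounded rewards and the subtracted offset f(Q), the iterates stay bounded; the sketch is only meant to show the shape of the update that the quoted hyperparameters parameterize.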