reproducibilityindex.ai

Scaling Up Robust MDPs using Function Approximation

Authors: Aviv Tamar, Shie Mannor, Huan Xu

ICML 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In this work we employ a reinforcement learning approach to tackle this planning problem: we develop a robust approximate dynamic programming method based on a projected fixed point equation to approximately solve large scale robust MDPs. We show that the proposed method provably succeeds under certain technical conditions, and demonstrate its effectiveness through simulation of an option pricing problem.
Researcher Affiliation	Academia	Aviv Tamar, AVIVT@TX.TECHNION.AC.IL, Electrical Engineering Department, The Technion - Israel Institute of Technology, Haifa 32000, Israel; Shie Mannor, SHIE@EE.TECHNION.AC.IL, Electrical Engineering Department, The Technion - Israel Institute of Technology, Haifa 32000, Israel; Huan Xu, MPEXUH@NUS.EDU.SG, Mechanical Engineering Department, National University of Singapore, Singapore 117575, Singapore
Pseudocode	No	The paper describes algorithms using equations and text, but does not include formal pseudocode blocks or algorithms labeled as such.
Open Source Code	Yes	The Matlab code for these results is provided in the supplementary material.
Open Datasets	No	Our price fluctuation model M follows a Bernoulli distribution (Cox et al., 1979), $x_{t+1} = \begin{cases} f_u x_t, & \text{w.p. } p \\ f_d x_t, & \text{w.p. } 1 - p \end{cases}$, where the up and down factors, $f_u$ and $f_d$, are constant. Our empirical evaluation proceeds as follows. In each experiment, we generate Ndata trajectories of length T from the true model M.
Dataset Splits	No	The paper describes generating Ndata, Nsim, and Ntest trajectories but does not specify distinct training, validation, and test splits with proportions or counts.
Hardware Specification	No	The paper does not provide any specific hardware details used for the experiments.
Software Dependencies	No	The paper mentions 'Matlab code' but does not specify any version numbers for Matlab or any other software dependencies.
Experiment Setup	Yes	The parameters for the experiments are provided in the supplementary material, and were chosen to balance the different factors in the problem. Most importantly, we chose γ = 0.98 and a large uncertainty set such that Assumption 2 is severely violated.
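
The Open Datasets row quotes a data-generation step rather than a published dataset: prices move up by a constant factor with probability p and down by a constant factor otherwise, and Ndata trajectories of length T are drawn from that model. The Python sketch below illustrates only that step. It is not the authors' Matlab code. The function name `simulate_trajectories`, the seed handling, and every numeric value in the example call are hypothetical, since the paper's actual parameters are given only in its supplementary material.

```python
import random


def simulate_trajectories(x0, f_up, f_down, p, horizon, n_traj, seed=0):
    """Draw n_traj price paths from the two-outcome (Bernoulli) model.

    Each step multiplies the current price by f_up with probability p
    and by f_down with probability 1 - p. Every path holds horizon + 1
    prices, starting with x0.
    """
    rng = random.Random(seed)  # local generator so runs are repeatable
    trajectories = []
    for _ in range(n_traj):
        x = x0
        path = [x]
        for _ in range(horizon):
            factor = f_up if rng.random() < p else f_down
            x = x * factor
            path.append(x)
        trajectories.append(path)
    return trajectories


# Hypothetical values for illustration only; not taken from the paper.
paths = simulate_trajectories(
    x0=1.0, f_up=1.1, f_down=0.9, p=0.5, horizon=20, n_traj=5
)
```

Using a dedicated `random.Random(seed)` instance rather than the module-level generator keeps repeated experiments independent of any other random calls in the same program, which matters when comparing results across runs.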