reproducibilityindex.ai

Universal Option Models

Authors: Hengshuai Yao, Csaba Szepesvari, Richard S. Sutton, Joseph Modayil, Shalabh Bhatnagar

NeurIPS 2014

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We demonstrate our method in two domains. The first domain is a real-time strategy game, where the controller must select the best game unit to accomplish a dynamically-specified task. The second domain is article recommendation, where each user query defines a new reward function and an article's relevance is the expected return from following a policy that follows the citations between articles. Our experiments show that UOMs are substantially more efficient than previously known methods for evaluating option returns and policies over options.
Researcher Affiliation	Academia	Hengshuai Yao, Csaba Szepesvári, Rich Sutton, Joseph Modayil: Department of Computing Science, University of Alberta, Edmonton, AB, Canada, T6H 4M5; {hengshua,szepesva,sutton,jmodayil}@cs.ualberta.ca. Shalabh Bhatnagar: Department of Computer Science and Automation, Indian Institute of Science, Bangalore-560012, India; shalabh@csa.iisc.ernet.in
Pseudocode	No	The paper describes algorithms and presents update rules mathematically (e.g., '$U^{o_k}_{k+1} = U^{o_k}_k + \eta^{o_k}_k \delta_{k+1} \phi(s_k)$'), but it does not include a formally labeled 'Pseudocode' or 'Algorithm' block with structured steps.
Open Source Code	No	The paper does not provide any statement or link indicating that the source code for its methodology is publicly available.
Open Datasets	No	The paper mentions using 'a collection from DBLP that has about 1.5 million articles', but it does not provide concrete access information (link, DOI, or a full citation for the dataset itself with authors and year for direct retrieval).
Dataset Splits	No	The paper mentions using '3000 trajectories' and averaging results 'over 30 runs', but it does not specify explicit train/validation/test dataset splits with percentages, sample counts, or references to predefined splits for reproducibility.
Hardware Specification	No	The paper mentions running experiments on 'a modern PC with an Intel 1.7GHz processor and 8GB RAM'. While it provides some detail, 'Intel 1.7GHz processor' is not a specific model number (e.g., Core i7-xxxx) required for reproducibility.
Software Dependencies	No	The paper describes its code only as 'a MATLAB implementation'. It does not give a version number for MATLAB or name any other software libraries or dependencies.
Experiment Setup	Yes	The discount factor was 0.9. Features were a lookup table over the 11 × 11 grid. For all algorithms, only one step of planning was applied per action selection. The planning step-size for each algorithm was chosen from {0.001, 0.01, 0.1, 1.0}. Only the best one was reported for an algorithm. All data reported were averaged over 30 runs.
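
The selection protocol quoted in the Experiment Setup row (try each planning step-size, average over 30 runs, report only the best) can be sketched as below. This is a minimal illustration, not the authors' code: `sweep` and `toy_run` are hypothetical names, `toy_run` is a made-up stand-in for a real experimental run, and the sketch assumes a lower score is better.

```python
import random
import statistics

STEP_SIZES = (0.001, 0.01, 0.1, 1.0)  # candidate planning step-sizes from the setup row
N_RUNS = 30                            # the setup row reports averages over 30 runs


def sweep(run_once, step_sizes=STEP_SIZES, n_runs=N_RUNS):
    """Average run_once(step_size, seed) over n_runs seeds for each step-size.

    Returns (best_step_size, mean_score_by_step_size).
    Assumption: a lower score is better (for example, an error measure).
    """
    means = {}
    for step_size in step_sizes:
        scores = [run_once(step_size, seed) for seed in range(n_runs)]
        means[step_size] = statistics.mean(scores)
    best = min(means, key=means.get)
    return best, means


def toy_run(step_size, seed):
    """Hypothetical stand-in for one run: a noisy error that is smallest at 0.1."""
    rng = random.Random(seed)
    return abs(step_size - 0.1) + 0.01 * rng.random()


best, means = sweep(toy_run)
print("best step-size:", best)
```

To reproduce a result, `toy_run` would be replaced by a function that runs one algorithm once with the given step-size and random seed. Reporting only the best step-size per algorithm, as the row describes, means the sweep is repeated separately for each algorithm being compared.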