Planning with Abstract Learned Models While Learning Transferable Subtasks
Authors: John Winder, Stephanie Milani, Matthew Landen, Erebus Oh, Shane Parr, Shawn Squire, Marie desJardins, Cynthia Matuszek
AAAI 2020, pp. 9992–10000
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Section 4: Experimental Methodology |
| Researcher Affiliation | Academia | John Winder,1 Stephanie Milani,2 Matthew Landen,3 Erebus Oh,1 Shane Parr,4 Shawn Squire,1 Marie des Jardins,5 and Cynthia Matuszek1 1University of Maryland, Baltimore County, 2Carnegie Mellon University, 3Georgia Institute of Technology, 4University of Massachusetts Amherst, 5Simmons University |
| Pseudocode | Yes | Algorithm 1 Planning with Abstract Learned Models |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code or links to a code repository. |
| Open Datasets | Yes | The Taxi domain (Dietterich 2000) is a common HRL problem... The Cleanup domain simulates a robot that tidies a house by putting blocks where they belong, similar to the game of Sokoban (MacGlashan et al. 2015; Guez et al. 2019). |
| Dataset Splits | No | The paper does not explicitly provide specific percentages, sample counts, or citations to predefined splits for training, validation, and test datasets. It mentions using established domains and grounding to new, random target MDPs for trials. |
| Hardware Specification | Yes | Experiments were performed on an Intel Core i7-4790K CPU @ 4.00 GHz with 20 GB of RAM. |
| Software Dependencies | No | The paper mentions using 'Value Iteration as the planner and R-MAX as the model-based RL algorithm' but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | No | The paper describes the domains (Taxi, Cleanup) and the types of hierarchies used (expert, learned, amended), and mentions that Value Iteration and R-MAX were used as algorithms. However, it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations. |
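The table notes that the paper uses Value Iteration as its planner without specifying an implementation. As a point of reference, the sketch below shows standard Value Iteration on a toy two-state MDP. This is not the paper's code; the transition model, rewards, discount factor, and convergence tolerance are all illustrative assumptions.

```python
# Standard Value Iteration sketch (not the paper's implementation).
# P[s][a] is a list of (probability, next_state) pairs; R[s][a] is the
# immediate reward. gamma and tol are assumed, illustrative values.

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    n = len(P)
    V = [0.0] * n
    while True:
        delta = 0.0
        for s in range(n):
            # Bellman backup: best one-step lookahead value over actions.
            q = [R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])
                 for a in range(len(P[s]))]
            best = max(q)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

# Toy MDP: in state 0, action "stay" yields 0, action "go" yields 1 and
# moves to state 1; state 1 is absorbing with zero reward.
P = [
    [[(1.0, 0)], [(1.0, 1)]],  # state 0: stay, go
    [[(1.0, 1)], [(1.0, 1)]],  # state 1: absorbing
]
R = [[0.0, 1.0], [0.0, 0.0]]

V = value_iteration(P, R)
# V[0] converges to 1.0 (take "go" once); V[1] converges to 0.0.
```

In R-MAX-style model-based RL, a planner like this is re-run over the learned model, with unknown state-action pairs assigned an optimistic reward to drive exploration.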