Planning with Abstract Learned Models While Learning Transferable Subtasks

Authors: John Winder, Stephanie Milani, Matthew Landen, Erebus Oh, Shane Parr, Shawn Squire, Marie desJardins, Cynthia Matuszek

Pages: 9992-10000

AAAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 4 Experimental Methodology
Researcher Affiliation | Academia | John Winder (1), Stephanie Milani (2), Matthew Landen (3), Erebus Oh (1), Shane Parr (4), Shawn Squire (1), Marie desJardins (5), and Cynthia Matuszek (1). Affiliations: (1) University of Maryland, Baltimore County; (2) Carnegie Mellon University; (3) Georgia Institute of Technology; (4) University of Massachusetts Amherst; (5) Simmons University
Pseudocode | Yes | Algorithm 1: Planning with Abstract Learned Models
Open Source Code | No | The paper does not state that the code is open source and does not link to a code repository.
Open Datasets | Yes | The Taxi domain (Dietterich 2000) is a common HRL problem... The Cleanup domain simulates a robot that tidies a house by putting blocks where they belong, similar to the game of Sokoban (MacGlashan et al. 2015; Guez et al. 2019).
Dataset Splits | No | The paper does not give percentages, sample counts, or citations to predefined splits for training, validation, and test datasets. It mentions using established domains and grounding to new, random target MDPs for trials.
Hardware Specification | Yes | Performed on i7-4790K CPU @ 4.00 GHz, 20 GB of RAM.
Software Dependencies | No | The paper mentions using 'Value Iteration as the planner and R-MAX as the model-based RL algorithm' but does not specify version numbers for any software dependencies or libraries.
Experiment Setup | No | The paper describes the domains (Taxi, Cleanup) and the types of hierarchies used (expert, learned, amended), and states that Value Iteration and R-MAX were the algorithms used. However, it does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations.