Between Imitation and Intention Learning

Authors: James MacGlashan, Michael L. Littman

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We present empirical results on multiple domains that demonstrate that performing IRL with a small, but non-zero, receding planning horizon greatly decreases the computational cost of planning while maintaining superior generalization performance compared to imitation learning. |
| Researcher Affiliation | Academia | James MacGlashan, Brown University (jamesmacglashan@brown.edu); Michael L. Littman, Brown University (mlittman@cs.brown.edu) |
| Pseudocode | Yes | Algorithm 1: ComputeValue(s, h) (an illustrative sketch of this finite-horizon recursion follows the table) |
| Open Source Code | Yes | We have also made RHIRL publically available as part of BURLAP, an open source reinforcement learning and planning library. (http://burlap.cs.brown.edu/) |
| Open Datasets | No | The paper uses expert demonstrations generated by the authors for the navigation, mountain car, and lunar lander domains. It does not provide concrete access (link, DOI, or citation to a public dataset) to these training demonstrations. |
| Dataset Splits | No | The paper mentions testing generalization performance on novel states but does not explicitly specify training/validation/test splits or sample counts for validation data. |
| Hardware Specification | No | The paper reports 'total training CPU time' but does not provide details about the hardware used, such as CPU models, GPU models, or memory. |
| Software Dependencies | No | The paper mentions using 'Weka's J48 classifier' and 'Weka's logistic regression implementation' but does not specify version numbers for Weka or the classifiers. |
| Experiment Setup | Yes | RHIRL used 10 steps of gradient ascent (navigation and lunar lander) and 15 steps of gradient ascent (mountain car). To facilitate generalization, the learned reward function is a linear combination of both task features and agent-space features. (an illustrative gradient-ascent sketch follows the table) |
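The Pseudocode row refers to the paper's Algorithm 1, ComputeValue(s, h), which is not reproduced here. The sketch below only illustrates the general shape of a finite-horizon (receding-horizon) value recursion; the function names and the MDP interface (`actions`, `transition_probs`, `reward`) as well as the memoization are assumptions for this sketch, not the authors' BURLAP implementation.

```python
# Illustrative finite-horizon value recursion in the spirit of
# "Algorithm 1: ComputeValue(s, h)". All interfaces here are assumed.
from functools import lru_cache

def make_compute_value(actions, transition_probs, reward, gamma=1.0):
    """Build a memoized ComputeValue(s, h) for a small, finite MDP.

    actions:          iterable of actions
    transition_probs: (s, a) -> iterable of (next_state, probability) pairs
    reward:           (s, a, next_state) -> float (in RHIRL, the learned reward)
    """
    actions = tuple(actions)  # make the action set reusable across calls

    @lru_cache(maxsize=None)
    def compute_value(s, h):
        # Base case: no planning steps remain within the receding horizon.
        if h == 0:
            return 0.0
        # Finite-horizon Bellman backup: best expected return over h steps.
        return max(
            sum(p * (reward(s, a, s2) + gamma * compute_value(s2, h - 1))
                for s2, p in transition_probs(s, a))
            for a in actions
        )

    return compute_value
```

With a small horizon h, this recursion only expands states reachable within h steps of the current state, which is the source of the computational savings the Research Type row describes.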
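The Experiment Setup row reports gradient ascent over a reward function that is a linear combination of task and agent-space features. As a rough illustration only, the sketch below pairs a linear reward with a generic gradient-ascent loop over a demonstration log-likelihood; the names `linear_reward`, `gradient_ascent`, and `log_likelihood`, the finite-difference gradient, and the learning rate are all hypothetical and are not taken from the paper or from BURLAP.

```python
import numpy as np

def linear_reward(theta, phi):
    """R(s, a) = theta . phi(s, a), where phi stacks task and agent-space features."""
    return float(np.dot(theta, phi))

def gradient_ascent(log_likelihood, theta0, n_steps=10, learning_rate=0.1, eps=1e-4):
    """Maximize a demonstration log-likelihood over reward weights theta.

    log_likelihood: theta -> scalar score of the expert demonstrations under
                    the policy induced by the reward theta . phi.
    A central finite-difference gradient is used purely for illustration.
    """
    theta = np.asarray(theta0, dtype=float).copy()
    for _ in range(n_steps):  # e.g. 10 steps (navigation, lunar lander) or 15 (mountain car)
        grad = np.zeros_like(theta)
        for i in range(theta.size):
            step = np.zeros_like(theta)
            step[i] = eps
            grad[i] = (log_likelihood(theta + step) - log_likelihood(theta - step)) / (2 * eps)
        theta += learning_rate * grad  # ascend the likelihood surface
    return theta
```

In the paper's setting, the likelihood would presumably be computed from receding-horizon values such as those returned by ComputeValue(s, h), and the loop would run for the 10 or 15 steps noted in the table; those connections are paraphrased here rather than taken from released code.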