reproducibilityindex.ai

ASAP-UCT: Abstraction of State-Action Pairs in UCT

Authors: Ankit Anand, Aditya Grover, Mausam, Parag Singla

IJCAI 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experimental evaluation on several benchmark domains shows up to 26% improvement in the quality of policies obtained over existing algorithms. Our experiments aim to study the comparative performance of various abstraction approaches in UCT.
Researcher Affiliation	Academia	Indian Institute of Technology, Delhi New Delhi, India
Pseudocode	Yes	Algorithm 1 Computing Abstract Search Tree, Algorithm 2 Abstraction of States, Algorithm 3 Abstraction of State-Action Pairs, Algorithm 4 ASAP-UCT Algorithm
Open Source Code	Yes	We implement and release ASAP-UCT, an algorithm that exploits SAP abstractions in a UCT framework. (Footnote 1: Available at https://github.com/dair-iitd/asap-uct)
Open Datasets	Yes	We experiment on three diverse domains, Sailing Wind [Kocsis and Szepesvári, 2006; Bonet and Geffner, 2012], Game of Life [Sanner and Yoon, 2011], and Navigation [Sanner and Yoon, 2011]. Our empirical results are reported on two IPPC-2011 instances, of dimensions 3×3 and 4×4 (#states: 2^9, 2^16).
Dataset Splits	No	The paper describes running experiments for a certain number of 'trials' and using 'execution horizon', but it does not specify explicit train/validation/test dataset splits as typically done for supervised learning tasks.
Hardware Specification	Yes	All our experiments are performed on a Quad-Core Intel i-5 processor.
Software Dependencies	No	The paper mentions using specific algorithms and borrowing base code for UCT, but it does not provide a list of software dependencies with specific version numbers (e.g., programming language versions, library versions).
Experiment Setup	Yes	The exploration constant K for the UCB equation is set as the negative of the magnitude of current Q value at the node (following [Bonet and Geffner, 2012]). We used l = 1 in our experiments. We use 100 as the execution horizon for the Sailing Wind and Navigation domains and 40 for the Game of Life domain.
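
The exploration-constant setting quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not taken from the released ASAP-UCT code. It assumes a cost-minimizing UCB rule, which is one reading of "the negative of the magnitude", and it assumes that "current Q value at the node" means the node's running value estimate. The names `exploration_constant`, `select_action`, `q_values`, `counts`, and `node_q` are hypothetical.

```python
import math


def exploration_constant(node_q):
    """Return K = -|Q| for the node (assumed cost-minimizing convention)."""
    return -abs(node_q)


def select_action(q_values, counts, node_q):
    """Pick the action minimizing Q(s, a) + K * sqrt(ln N / n_a).

    q_values: dict mapping action -> mean cost estimate (hypothetical layout)
    counts:   dict mapping action -> number of times the action was tried
    node_q:   current value estimate of the node, used only to set K
    """
    # Try every action once before applying the UCB rule.
    for action, n in counts.items():
        if n == 0:
            return action

    total = sum(counts.values())
    k = exploration_constant(node_q)

    def score(action):
        # Rarely tried actions get a larger bonus, which lowers their score
        # because K is negative and the lowest score wins.
        bonus = math.sqrt(math.log(total) / counts[action])
        return q_values[action] + k * bonus

    return min(q_values, key=score)
```

Tying K to the magnitude of Q keeps the exploration bonus on the same scale as the value estimates, so no per-domain tuning of the constant is needed in this sketch. With a reward-maximizing convention the same idea would use K = |Q| and `max` instead of `min`.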