reproducibilityindex.ai

Factored MCTS for Large Scale Stochastic Planning

Authors: Hao Cui, Roni Khardon, Alan Fern, Prasad Tadepalli

AAAI 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	An extensive experimental evaluation demonstrates that the new algorithms provide significant improvement over the state of the art when solving large problems in a number of challenge benchmark domains. First, we present an experimental study on current challenge problems and moderately larger problems exposing the above mentioned phenomenon. We ran experiments on the Tufts UIT research cluster (each node includes Intel Xeon X5675 @ 3GHz CPU, and 24GB memory).
Researcher Affiliation	Academia	Hao Cui and Roni Khardon Department of Computer Science Tufts University Medford, MA 02155, USA Alan Fern and Prasad Tadepalli School of Electrical Engineering and Computer Science Oregon State University Corvallis, OR 97331, USA
Pseudocode	No	The paper describes the algorithms (ARollout, AMCTS) in detail with textual explanations, but it does not provide any structured pseudocode or algorithm blocks.
Open Source Code	No	The paper does not provide any explicit statement about releasing its own source code or a link to a repository containing the implementation of the described methodology. It mentions using third-party software like RDDL and PROST, but these are not the authors' code for their proposed methods.
Open Datasets	Yes	Four domains are from IPPC2011, the elevators domain (where people randomly arrive at each floor and go to either top or bottom floor), the sysadmin domain (where failures of computers depend on their neighbors and one can reboot a number of computers at each time step), the crossing traffic domain (where a robot tries to get to the other side of a river with randomly appearing flowing obstacles), and the traffic domain (where one controls traffic lights to enable traffic flow). The IPPC provided 10 instances for each domain. We similarly generated 20 instances for one-dir-elevators domain. (Footnote 1: http://concurrent-value-iteration.googlecode.com/svnhistory/r133/trunk/rddl/elevators_mdp.rddl)
Dataset Splits	No	The paper refers to problem "instances" for evaluation but does not specify any train/validation/test dataset splits, as it's not based on a traditional supervised learning dataset where such splits are common. It's about algorithms for stochastic planning problems.
Hardware Specification	Yes	We ran experiments on the Tufts UIT research cluster (each node includes Intel Xeon X5675 @ 3GHz CPU, and 24GB memory).
Software Dependencies	No	The paper mentions using "RDDL software" and the "PROST system" but does not provide specific version numbers for these or any other software dependencies, which would be necessary for reproducibility.
Experiment Setup	Yes	For PROST we use the IPPC2011 setting except that the allocated time per step is explicitly set. For our algorithms, as mentioned above, the number of samples per action a for the estimate of Q^π(s, a) is determined dynamically. The parameter n that controls the number of action samples when aggregating over actions is set to min{10, 0.6|A|}. The simulation depth in ARollout and the depth of the MCTS tree are set to half of the horizon used in evaluation time. We use ϵ = 0.5 for the ϵ-greedy action choice in all algorithms.
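
The three fixed settings quoted in the Experiment Setup row can be written down as a small Python sketch. This is not the authors' code: the function names are hypothetical, and the rounding of 0.6|A| (rounded up here), the lower bound of 1, and the integer halving of the horizon are assumptions, since the quoted text does not say how non-integer values are handled.

```python
import random


def num_action_samples(num_actions: int) -> int:
    """n = min{10, 0.6|A|}. Rounding up and a floor of 1 are assumptions."""
    scaled = (6 * num_actions + 9) // 10  # integer ceiling of 0.6 * |A|
    return max(1, min(10, scaled))


def search_depth(horizon: int) -> int:
    """Half of the evaluation horizon. Integer division is an assumption."""
    return max(1, horizon // 2)


def epsilon_greedy(q_values: dict, epsilon: float = 0.5, rng=random):
    """Pick a uniformly random action with probability epsilon, else the greedy one.

    q_values maps each action to its current value estimate.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)


if __name__ == "__main__":
    print(num_action_samples(4))    # small action set: 0.6 * 4 = 2.4, rounded up
    print(num_action_samples(100))  # large action set: capped at 10
    print(search_depth(40))         # horizon of 40 steps
    print(epsilon_greedy({"up": 1.0, "down": 0.2}))
```

With ϵ = 0.5, half of the action choices are uniformly random, which is a far more exploratory setting than the small ϵ values common elsewhere; anyone re-running the experiments should confirm it against the paper rather than this sketch.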