reproducibilityindex.ai

Learning Predictive State Representations via Monte-Carlo Tree Search

Authors: Yunlong Liu, Hexing Zhu, Yifeng Zeng, Zongxiong Dai

IJCAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct experiments on several domains including one extremely large domain and the experimental results show the effectiveness of our approach.
Researcher Affiliation	Academia	1 Department of Automation, Xiamen University, China; 2 School of Computing, Teesside University, UK
Pseudocode	Yes	Algorithm 1 shows our proposed algorithm in detail in pseudocode, where s(v) denotes the corresponding state of node v, which is the set of actions (tests) starting from the root node to node v. Algorithm 1: The discovery using MCTS algorithm
Open Source Code	No	The paper does not provide a link to open-source code for their method or explicitly state that their code is being released.
Open Datasets	Yes	We evaluated the proposed technique in three domains of different size, namely Cheese Maze, Hallway2 [Cassandra, 1999] and Poc Man [Silver and Veness, 2010; Hamilton et al., 2014].
Dataset Splits	No	No explicit validation dataset split information (e.g., percentages, counts, cross-validation setup) was found. The paper mentions training and testing sequence lengths but not a distinct validation split.
Hardware Specification	No	The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies or libraries used.
Experiment Setup	Yes	To accelerate the search process and as the order of actions in the action sequence for our method has no effect on the final result, for Cheese Maze, the number of legal actions at each node was set to 10, the candidate actions were limited to the possible length 1 and 2 tests; for Hallway2 and Poc Man, the number of legal actions at each node was set to 20, the candidate actions were limited to the possible length 1 tests. The exploration constant c was set to 0.001 and a state is considered to be terminal when the search reaches a certain depth.
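
The settings quoted in the Experiment Setup row can be written down as a small configuration, together with a rough sketch of how an exploration constant like c = 0.001 typically enters MCTS node selection. This is an illustration only, not the authors' code: the UCB1-style formula is a common UCT form assumed here (the quoted text does not give the paper's exact selection rule), and the names `SEARCH_CONFIG`, `ucb1_score`, `select_child`, `is_terminal`, and `max_depth` are hypothetical. The depth limit is left as a parameter because the quoted text only says "a certain depth".

```python
import math

# Per-domain settings as quoted in the Experiment Setup row.
# The dictionary layout and key names are illustrative assumptions.
SEARCH_CONFIG = {
    "Cheese Maze": {"legal_actions_per_node": 10, "candidate_test_lengths": (1, 2)},
    "Hallway2": {"legal_actions_per_node": 20, "candidate_test_lengths": (1,)},
    "Poc Man": {"legal_actions_per_node": 20, "candidate_test_lengths": (1,)},
}
EXPLORATION_C = 0.001  # exploration constant c from the quoted setup


def ucb1_score(mean_reward, child_visits, parent_visits, c=EXPLORATION_C):
    """UCB1-style score (assumed form): mean reward plus an exploration bonus."""
    if child_visits == 0:
        return math.inf  # unvisited children are always tried first
    return mean_reward + c * math.sqrt(math.log(parent_visits) / child_visits)


def select_child(children, parent_visits, c=EXPLORATION_C):
    """children is a list of (mean_reward, visits); returns the best index."""
    scores = [ucb1_score(q, n, parent_visits, c) for q, n in children]
    return max(range(len(children)), key=scores.__getitem__)


def is_terminal(depth, max_depth):
    """A state counts as terminal once the search reaches max_depth."""
    return depth >= max_depth
```

With c this small, the exploration bonus is tiny next to any meaningful difference in mean reward, so selection among already-visited children is close to greedy. For example, `select_child([(0.5, 10), (0.49, 1)], 11)` picks the first child with c = 0.001, whereas c = 1.0 would favour the rarely visited second child.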