Generalized Mean Estimation in Monte-Carlo Tree Search

Authors: Tuan Dam, Pascal Klink, Carlo D'Eramo, Jan Peters, Joni Pajarinen

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We theoretically analyze our method, providing guarantees of convergence to the optimum. Finally, we empirically demonstrate the effectiveness of our method in well-known MDP and POMDP benchmarks, showing significant improvement in performance and convergence speed w.r.t. state-of-the-art algorithms. |
| Researcher Affiliation | Academia | 1. Department of Computer Science, Technische Universität Darmstadt, Germany; 2. Robot Learning Group, Max Planck Institute for Intelligent Systems, Tübingen, Germany; 3. Computing Sciences, Tampere University, Finland |
| Pseudocode | Yes | Algorithm 1: Power-UCT (a minimal sketch of the power-mean backup appears below the table) |
| Open Source Code | No | No explicit statement about providing open-source code for the methodology, and no link to a repository. |
| Open Datasets | Yes | For MDPs, we consider the well-known FrozenLake problem as implemented in OpenAI Gym [Brockman et al., 2016]. |
| Dataset Splits | No | The paper does not specify explicit training, validation, and test dataset splits by percentage or sample count; it mentions "evaluation runs" but no data partitioning for model training and selection. |
| Hardware Specification | No | The paper does not report hardware details such as CPU/GPU models, memory, or other computational resources used for the experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For MENTS, we find the best combination of the two hyperparameters by grid search. In MDP tasks, we find the UCT exploration constant using grid search. For Power-UCT, we find the p-value by increasing it until performance starts to decrease. (Both recipes are sketched below.) |
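The Pseudocode row points to Algorithm 1, Power-UCT, whose key change to UCT is backing up node values with a visit-weighted power mean instead of an arithmetic mean. Below is a minimal Python sketch of that backup step, assuming non-negative child values and a finite p >= 1; the function and variable names are illustrative, not taken from the paper.

```python
import numpy as np

def power_mean_backup(child_values, child_visits, p):
    """Visit-weighted power mean of child action values.

    At p = 1 this reduces to UCT's ordinary weighted average; as p
    grows it approaches the maximum child value, which is the
    interpolation Power-UCT exploits. Assumes non-negative values
    (e.g., returns normalized to [0, 1]).
    """
    q = np.asarray(child_values, dtype=float)
    n = np.asarray(child_visits, dtype=float)
    w = n / n.sum()                  # visit-count weights
    return (w @ q**p) ** (1.0 / p)   # (sum_i w_i * q_i^p)^(1/p)

# p = 1 recovers the weighted mean; larger p leans toward the max.
values, visits = [0.2, 0.8, 0.5], [10, 30, 20]
print(power_mean_backup(values, visits, p=1.0))   # 0.6 (weighted mean)
print(power_mean_backup(values, visits, p=10.0))  # ~0.75, nearer the max
```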
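The Open Datasets row cites the FrozenLake environment from OpenAI Gym. A minimal interaction loop is sketched below; note that the env id ("FrozenLake-v1") and the five-value step return follow Gym >= 0.26, whereas the 2016 release the paper cites exposed "FrozenLake-v0" and a four-value return.

```python
import gym

# FrozenLake: a small stochastic grid-world MDP. With is_slippery=True
# the agent may slide to an unintended neighboring cell, which is what
# makes the planning problem non-trivial.
env = gym.make("FrozenLake-v1", is_slippery=True)

obs, info = env.reset(seed=0)
done, episode_return = False, 0.0
while not done:
    action = env.action_space.sample()  # random policy as a placeholder
    obs, reward, terminated, truncated, info = env.step(action)
    episode_return += reward
    done = terminated or truncated
print("episode return:", episode_return)
```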
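The Experiment Setup row describes two tuning recipes: grid search for the UCT exploration constant (and the MENTS hyperparameters), and increasing Power-UCT's p until performance starts to decrease. A sketch of both follows, where `evaluate` is a hypothetical callable returning mean return over a batch of evaluation runs for the given hyperparameters.

```python
def grid_search(candidates, evaluate):
    """Pick the exploration constant with the best evaluation score."""
    return max(candidates, key=lambda c: evaluate(exploration_constant=c))

def tune_p(evaluate, p=1.0, factor=2.0):
    """Increase p until performance starts to decrease, then keep the
    last value that still improved (the paper's stated recipe)."""
    best = evaluate(p=p)
    while True:
        score = evaluate(p=p * factor)
        if score <= best:
            return p
        p, best = p * factor, score

# Hypothetical usage, with evaluate closing over an agent and environment:
# best_c = grid_search([0.1, 0.5, 1.0, 2.0], evaluate)
# best_p = tune_p(evaluate)
```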