Monte-Carlo Tree Search by Best Arm Identification

Authors: Emilie Kaufmann, Wouter M. Koolen

NeurIPS 2017

Reproducibility Variable Result LLM Response
Research Type Experimental We show experimentally that our algorithms outperform existing elimination-based algorithms and match previous special-purpose methods for depth-two trees.
Researcher Affiliation Academia Emilie Kaufmann, CNRS & Univ. Lille, UMR 9189 (CRIStAL), Inria SequeL, Lille, France, emilie.kaufmann@univ-lille1.fr; Wouter M. Koolen, Centrum Wiskunde & Informatica, Science Park 123, 1098 XG Amsterdam, The Netherlands, wmkoolen@cwi.nl
Pseudocode Yes Figure 2: The BAI-MCTS architecture
Open Source Code No The paper does not provide any statement or link indicating that the source code for their methodology is publicly available.
Open Datasets Yes We evaluate on the depth-two benchmark tree from [10], a new depth-three tree and the random tree ensemble from [22].
Dataset Splits Yes All counts are averages over 10K repetitions with ϵ = 0 and δ = 0.1. To replicate the experiment from [22], we supply all algorithms with δ = 0.1 and ϵ = 0.01. On 10K full 10-ary trees of depth 3 with Bernoulli leaf parameters drawn uniformly at random from [0,1] the average numbers of samples are: LUCB-MCTS 141811, UGapE-MCTS 142953 and FindTopWinner 2254560.
Hardware Specification No The paper does not provide any specific details about the hardware used for running experiments (e.g., CPU/GPU models, memory, or cloud instances).
Software Dependencies No The paper does not provide specific software dependencies with version numbers (e.g., library names like PyTorch or TensorFlow with their versions).
Experiment Setup Yes To replicate the experiment from [22], we supply all algorithms with δ = 0.1 and ϵ = 0.01. For comparing with [10] we run all algorithms with ϵ = 0 and δ = 0.1·L (undoing the conservative union bound over leaves; this excessive choice, which might even exceed one, does not cause a problem, as the algorithms depend on δ/L = 0.1). For our BAI-MCTS algorithms and for M-LUCB we use the exploration rate β(s,δ) = ln(L/δ) + ln(ln(s) + 1) (a stylized version of Lemma 2 that works well in practice), and we use the KL refinement of the confidence intervals (1).
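The quoted setup combines the stylized exploration rate β(s,δ) = ln(L/δ) + ln(ln(s) + 1) with KL-based confidence intervals for Bernoulli leaves. A minimal sketch of these two ingredients is below; the helper names are mine (not the authors' code), and the KL upper bound is computed by simple bisection, which is one standard way to invert the binary KL divergence.

```python
import math

def beta(s, delta, L):
    """Stylized exploration rate quoted in the report:
    beta(s, delta) = ln(L/delta) + ln(ln(s) + 1), for s >= 1 samples
    and L leaves. Hypothetical helper, not the authors' implementation."""
    return math.log(L / delta) + math.log(math.log(s) + 1)

def kl_bernoulli(p, q):
    """Binary KL divergence kl(p, q) between Bernoulli(p) and Bernoulli(q)."""
    eps = 1e-12  # clip away from 0 and 1 to keep the logs finite
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

def kl_ucb(p_hat, n, threshold):
    """KL upper confidence bound for a Bernoulli mean: the largest
    q in [p_hat, 1] such that n * kl(p_hat, q) <= threshold,
    found by bisection (kl(p_hat, .) is increasing on [p_hat, 1])."""
    lo, hi = p_hat, 1.0
    for _ in range(60):  # 60 halvings give far more than float precision
        mid = (lo + hi) / 2
        if n * kl_bernoulli(p_hat, mid) <= threshold:
            lo = mid
        else:
            hi = mid
    return lo
```

For example, a leaf with empirical mean 0.5 after 10 pulls, in a tree with L = 3 leaves and δ = 0.1, would get the upper confidence bound `kl_ucb(0.5, 10, beta(10, 0.1, 3))`; the bound shrinks toward the empirical mean as the pull count n grows.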