reproducibilityindex.ai

Identification of the Generalized Condorcet Winner in Multi-dueling Bandits

Authors: Björn Haddenhorst, Viktor Bengs, Eyke Hüllermeier

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In a numerical study, we show that DKWT empirically outperforms current state-of-the-art algorithms, even in the special case of dueling bandits or under a Plackett-Luce assumption on the feedback mechanism.In the following, we present experimental results on the performance of our GCW identiﬁcation solution.
Researcher Affiliation	Academia	Björn Haddenhorst Department of Computer Science Paderborn University Paderborn, Germany bjoernha@mail.upb.de Viktor Bengs Institute of Informatics University of Munich (LMU) Munich, Germany viktor.bengs@lmu.de Eyke Hüllermeier Institute of Informatics University of Munich (LMU) Munich, Germany eyke@ifi.lmu.de
Pseudocode	Yes	Algorithm 1 DKW mode-identiﬁcation with abstention Input: γ (0, 1), h (0, 1), access to iid samples Xt Cat(p) 1: T 8 ln(4/γ)/h2 2: Observe samples X1, . . . , XT 3: Calculate ˆp T = (ˆp T 1 , . . . , ˆp T k ) as ˆp T i := 1 T PT t=1 1{Xt=i}, i [k] 4: Choose i mode(ˆp T ) 5: if ˆp T i > maxj =i ˆp T j + h then return i 6: else return UNSURE and Algorithm 2 DKW mode-identiﬁcation Solution to Pk,γ k ( 0) and Algorithm 3 DVORETZKY KIEFER WOLFOWITZ TOURNAMENT Solution to Pm,γ k ( GCW 0)
Open Source Code	Yes	Our implementation is provided at https://github.com/bjoernhad/GCWidentification.
Open Datasets	No	The experiments are conducted on probability models generated with specific parameters (e.g., 'P PM 5 k(PL) with underlying PL-parameter θ') or sampled uniformly at random, rather than using a distinct public dataset with explicit access information.
Dataset Splits	No	The paper's empirical evaluation focuses on the sample complexity and accuracy of bandit algorithms based on observations drawn from a probabilistic model, rather than using a fixed dataset with explicit training, validation, or test splits. The term 'samples' refers to the observations generated by the model during the algorithm's execution, not pre-split data portions.
Hardware Specification	Yes	All experiments were conducted on a machine with an Intel Core i7-4700MQ Processor, executing all experiments (including those in the supplemental material) with only one CPU core in use took less than 72 hours.
Software Dependencies	No	The paper states that its implementation is provided at a GitHub link but does not explicitly list any software dependencies with specific version numbers within the text.
Experiment Setup	Yes	We restrict ourselves in the main paper to DKWT, which is our solution of the most general problem Pm,γ k ( h GCW 0). Throughout all experiments, if not speciﬁed differently in the pseudocode, every choice of an element within a speciﬁc set made by DKWT is performed uniformly at random. and Table 3 shows the results of both algorithms when started on an instance P PM 5 k(PL) with underlying PL-parameter θ = (1, 0.8, 0.6, 0.4, 0.2) and γ = 0.05, for different values of k.