reproducibilityindex.ai

Reducing Dueling Bandits to Cardinal Bandits

Authors: Nir Ailon, Zohar Karnin, Thorsten Joachims

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	For Doubler and Multi SBM we prove regret upper bounds in both ﬁnite and inﬁnite settings, and conjecture about the performance of Sparring which empirically outperforms the other two as well as previous algorithms in our experiments. In addition, we provide the ﬁrst almost optimal regret bound in terms of second order terms, such as the differences between the values of the arms.
Researcher Affiliation	Collaboration	Nir Ailon NAILON@TECHNION.AC.IL Technion, Dept. of Computer Science, Haifa 32000, Israel Zohar Karnin ZKARNIN@YMAIL.COM Yahoo Labs, Haifa 31905, Israel Thorsten Joachims TJ@CS.CORNELL.EDU Cornell University, Dept. of Computer Science, Ithaca, NY 14850, USA
Pseudocode	Yes	Algorithm 1 UCB algorithm for MAB with \|X\| = K arms.; Algorithm 2 (Doubler): Reduction for ﬁnite and inﬁnite X with internal structure.; Algorithm 3 (Multi SBM): Reduction for unstructured ﬁnite X by using K SBMs in parallel.; Algorithm 4 (Sparring): Reduction to two SBMs.
Open Source Code	No	The paper does not provide any statement or link regarding the availability of its source code.
Open Datasets	Yes	For this comparison, we shall use the same matrix (ϵxy)x,y X as in (Yue & Joachims, 2011), which was empirically estimated from an operational search engine.
Dataset Splits	No	The paper describes a simulation-based experiment and does not mention explicit train/validation/test splits with percentages, sample counts, or predefined split citations.
Hardware Specification	No	The paper does not provide any specific hardware details such as GPU models, CPU types, or memory specifications used for running its experiments.
Software Dependencies	No	The paper does not provide specific software dependencies with version numbers, such as programming languages or libraries, used for its experiments.
Experiment Setup	Yes	Henceforth, the set X of arms is {A, B, C, D, E, F}. We considered 5 choices of the expected value function µ( ) and 3 link functions... For each 15 combinations of arm values and link function we ran all 5 algorithms IF, BTMB, Multi SBM, Doubler, and Sparring with random inputs spanning a time horizon of up to 32000... averaged over 400 executions, with one standard deviation conﬁdence bars.