Optimal Algorithms for Stochastic Contextual Preference Bandits

Authors: Aadirupa Saha

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "This section gives the empirical performance of our algorithms (Alg. 1 and 3) and compares them with some existing preference learning algorithms."
Researcher Affiliation | Industry | Microsoft Research, New York, US; aasa@microsoft.com.
Pseudocode | Yes | Algorithm 1: Maximum-Informative-Pair (MaxInP). A rough sketch of the pair-selection idea appears after this table.
Open Source Code | No | The paper provides no explicit statement or link to open-source code for the described methodology.
Open Datasets | No | The paper describes synthetic problem instances built from utility functions g(·) (Quadratic, Six-Hump Camel, Goldstein) generated for the experiments, but provides no access information (links, DOIs, or formal citations) for a publicly available open dataset.
Dataset Splits | No | The paper does not specify dataset splits (percentages, sample counts, or citations to predefined splits) for training, validation, or testing.
Hardware Specification | No | No hardware details (e.g., GPU/CPU models or memory) used for running the experiments are provided.
Software Dependencies | No | The paper mentions techniques such as GP fitting and kernelized self-sparring and refers to existing works ([29], [37]), but provides no version numbers for any software, libraries, or frameworks used in the experiments.
Experiment Setup | Yes | "For this experiment we fix d = 10 and K = 50. Fig. 2 shows both our algorithms MaxInP and StaD always outperform the rest... We use these 3 functions as g(·): 1. Quadratic, 2. Six-Hump Camel and 3. Goldstein. For all cases, we fix d = 3 and K = 50." A sketch of one possible instance generator appears after this table.
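
The paper's Algorithm 1 (MaxInP) is reported only as pseudocode. Purely as an illustration of the underlying idea, selecting the pair of arms whose relative score is most uncertain under a linear utility model, here is a minimal sketch. The function name, the confidence-width criterion as the sole selection rule, and the design-matrix update are assumptions, not the paper's exact Algorithm 1.

```python
import numpy as np

def max_informative_pair(X, V):
    """Hypothetical sketch of a maximum-informative-pair rule.

    X: (K, d) arm feature matrix; V: (d, d) regularized design matrix.
    Returns the index pair (i, j) maximizing ||x_i - x_j||_{V^{-1}},
    i.e. the duel whose score gap has the widest confidence interval.
    """
    V_inv = np.linalg.inv(V)
    K = X.shape[0]
    best_width, best_pair = -np.inf, (0, 0)
    for i in range(K):
        for j in range(i + 1, K):
            z = X[i] - X[j]
            width = np.sqrt(z @ V_inv @ z)  # uncertainty of the (i, j) duel
            if width > best_width:
                best_width, best_pair = width, (i, j)
    return best_pair

# After observing a duel on pair (i, j), the design matrix would be
# updated with the feature difference (again, an assumed update rule):
#   V += np.outer(X[i] - X[j], X[i] - X[j])
```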
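The Experiment Setup excerpt fixes only d, K, and the three choices of g(·). The following is a minimal sketch of how such a synthetic preference instance could be generated, assuming uniform arm features, a unit-norm parameter θ, and a logistic link on utility gaps; none of these choices are confirmed by the excerpt.

```python
import numpy as np

# Hypothetical generator for a synthetic contextual preference instance.
# d = 3 and K = 50 follow the excerpt; everything else is an assumption.
rng = np.random.default_rng(0)
d, K = 3, 50

theta = rng.normal(size=d)
theta /= np.linalg.norm(theta)           # assumed unit-norm parameter
X = rng.uniform(-1.0, 1.0, size=(K, d))  # assumed feature range for K arms

def g_quadratic(s):
    # One of the three transfer functions named in the paper; the
    # Six-Hump Camel and Goldstein functions would be swapped in here.
    return s ** 2

scores = g_quadratic(X @ theta)          # non-linear utility of each arm

def duel(i, j):
    """Preference feedback for the duel (i, j): returns True when arm i
    wins, sampled via an assumed logistic link on the utility gap."""
    p = 1.0 / (1.0 + np.exp(scores[j] - scores[i]))
    return rng.random() < p
```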