Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Contextual Bandits with Similarity Information

Authors: Aleksandrs Slivkins

JMLR 2014 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical We present algorithms that are based on adaptive partitions, and take advantage of benign payoffs and context arrivals without sacrificing the worst-case performance. The central idea is to maintain a finer partition in high-payoffregions of the similarity space and in popular regions of the context space. Our results apply to several other settings, e.g., MAB with constrained temporal change (Slivkins and Upfal, 2008) and sleeping bandits (Kleinberg et al., 2008a). Our key contribution is an algorithm which maintains a partition of the metric space and adaptively refines this partition over time. Due to this adaptive partition technique, one can take advantage of benign problem instances without sacrificing the worst-case performance; here benign-ness refers to both expected payoffs and context arrivals. We essentially resolve the setting where expected payofffrom every given context-arm pair either does not change over time, or changes slowly. In particular, we obtain nearly matching lower bounds (for time-invariant expected payoffs and for an important special case of slow change).
Researcher Affiliation Industry Aleksandrs Slivkins EMAIL Microsoft Research New York City 641 6th Ave, 7th floor New York, NY 10011 USA
Pseudocode Yes Algorithm 1 Contextual zooming algorithm. Algorithm 2 Algorithm Contextual Bandit.
Open Source Code No The paper describes algorithms (Contextual zooming, Contextual Bandit) and their theoretical properties. It does not provide any links to source code repositories or explicit statements about releasing code for the described methodology.
Open Datasets No The paper is theoretical, focusing on algorithm design, analysis, and proofs for contextual bandit problems. It does not conduct experiments that would require the use of specific datasets, and therefore, it does not mention or provide access information for any open datasets.
Dataset Splits No The paper is theoretical and does not involve empirical experiments with datasets. Therefore, there is no mention of dataset splits (e.g., training, validation, test splits) in the text.
Hardware Specification No The paper is theoretical, focusing on the design and analysis of algorithms for contextual bandit problems. It does not report on any empirical experiments or implementations that would require specific hardware, and thus no hardware specifications are mentioned.
Software Dependencies No The paper describes algorithms and theoretical concepts without detailing any empirical implementations. Consequently, it does not list specific software dependencies or their version numbers that would be needed to replicate experiments.
Experiment Setup No The paper is theoretical, focusing on algorithm design and their mathematical properties. It does not include any experimental results, and therefore, it does not provide details on experimental setup, hyperparameters, or training configurations.