First- and Second-Order Bounds for Adversarial Linear Contextual Bandits

Authors: Julia Olkhovskaya, Jack Mayo, Tim van Erven, Gergely Neu, Chen-Yu Wei

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | Our main contribution is to replace the worst-case rate by adaptive bounds. Specifically, we obtain a bound of O(√V_T) in terms of a quadratic measure V_T of the variance of the losses of the algorithm, and a bound of O(√L*_T), where L*_T is the cumulative loss incurred by the optimal policy. Such bounds in terms of L*_T or V_T are generally referred to as first-order and second-order bounds, respectively, and have been extensively studied in the bandit literature. (A notation sketch of L*_T and V_T is given after this table.)
Researcher Affiliation | Academia | Julia Olkhovskaya (1), Jack Mayo (2), Tim van Erven (2), Gergely Neu (3), Chen-Yu Wei (4). (1) Department of Intelligent Systems, Delft University of Technology, Delft, The Netherlands; (2) Korteweg-de Vries Institute for Mathematics, University of Amsterdam, Amsterdam, The Netherlands; (3) AI group, DTIC, Universitat Pompeu Fabra, Barcelona, Spain; (4) MIT Institute for Data, Systems, and Society, Massachusetts Institute of Technology, Cambridge, MA, USA.
Pseudocode | Yes | Algorithm 1 (CONTEXTEW). Parameters: γ > 0, η_1, ..., η_T > 0, m_1, ..., m_T. For t = 1, ..., T: 1. Observe X_t. 2. Repeat: pick Q_t from the distribution p_t defined in (8), until ∑_{a=1}^K ||Q_{t,a} X_t||²_{Σ_{t,a}^{-1}} ≤ dKγ² (condition (6)), where Σ_{t,a} is defined in (9). (A code sketch of this resampling step is given after this table.)
Open Source Code | No | The paper does not contain any statement about making its own code open source, nor does it provide a link to a repository.
Open Datasets | No | The paper does not refer to any specific dataset or provide access information for one. It mentions that 'the d-dimensional contexts are drawn from a fixed known distribution', but this is a theoretical assumption, not a specific dataset.
Dataset Splits | No | The paper is a theoretical work focused on deriving bounds and analyzing algorithms. It does not conduct experiments on a dataset and therefore does not mention any training, validation, or test splits.
Hardware Specification | No | The paper is theoretical and does not report on running experiments. While it discusses computational efficiency, it does not provide any specific hardware details used for running experiments.
Software Dependencies | No | The paper is theoretical and focuses on mathematical derivations and algorithm analysis rather than practical implementation. It does not list specific software dependencies with version numbers needed to replicate any experiments.
Experiment Setup | No | The paper introduces an algorithm (CONTEXTEW) with parameters such as γ, η_t, and m_t, but these are part of the algorithm's theoretical definition rather than specific values used in an empirical experimental setup with hardware, software, or training configurations.
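The Research Type row quotes the paper's first- and second-order bounds. The block below is a minimal notation sketch of the two adaptive quantities, assuming the standard adversarial linear contextual bandit setup in which the loss of action a in round t is the inner product ⟨X_t, θ_{t,a}⟩ and the learner draws actions from a distribution p_t. The exact definitions in the paper (in particular the centring and scaling of V_T, and any logarithmic factors hidden in the O(·) notation) may differ; this is only meant to fix notation.

```latex
% Hedged notation sketch (not copied from the paper): losses are assumed to be
% linear in the context, \ell_{t,a} = \langle X_t, \theta_{t,a} \rangle.

% First-order quantity: cumulative loss of the optimal (benchmark) policy.
\[
  L_T^* = \min_{\pi} \, \mathbb{E}\!\left[\sum_{t=1}^{T} \langle X_t, \theta_{t,\pi(X_t)} \rangle\right],
  \qquad \text{regret} = O\!\big(\sqrt{L_T^*}\big).
\]

% Second-order quantity: a quadratic (second-moment) measure of the algorithm's
% own losses; the paper's precise definition of V_T may be centred differently.
\[
  V_T = \sum_{t=1}^{T} \mathbb{E}_{a \sim p_t}\!\left[\langle X_t, \theta_{t,a} \rangle^{2}\right],
  \qquad \text{regret} = O\!\big(\sqrt{V_T}\big).
\]
```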
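The Pseudocode row only preserves the resampling step of Algorithm 1 (CONTEXTEW). Below is a minimal Python sketch of that step, assuming the distribution p_t (equation (8)) and the matrices Σ_{t,a} (equation (9)) are supplied by the caller. The sampler sample_from_pt, the helper name resample_Q, the array shapes, and the max_tries guard are all illustrative placeholders rather than the paper's implementation; only the acceptance test from condition (6) is transcribed literally.

```python
# Minimal, hypothetical sketch of the resampling step in Algorithm 1 (CONTEXTEW).
# `sample_from_pt` stands in for the distribution p_t of equation (8) and
# `Sigma_inv` for the inverses of the matrices Sigma_{t,a} of equation (9);
# neither is defined in the extracted snippet, so their shapes here are assumptions.
import numpy as np


def resample_Q(sample_from_pt, Sigma_inv, X_t, d, K, gamma, max_tries=1000):
    """Redraw Q_t from p_t until sum_a ||Q_{t,a} X_t||^2_{Sigma_{t,a}^{-1}} <= d*K*gamma^2."""
    threshold = d * K * gamma ** 2
    for _ in range(max_tries):
        Q_t = sample_from_pt()  # assumed shape (K, d, d): one matrix per action
        stat = 0.0
        for a in range(K):
            v = Q_t[a] @ X_t              # the vector Q_{t,a} X_t
            stat += v @ Sigma_inv[a] @ v  # its squared norm in the Sigma_{t,a}^{-1} geometry
        if stat <= threshold:  # acceptance test, condition (6)
            return Q_t
    raise RuntimeError("resampling did not terminate within max_tries")


# Purely illustrative usage with dummy placeholder quantities:
if __name__ == "__main__":
    d, K, gamma = 3, 4, 0.5
    rng = np.random.default_rng(0)
    Q = resample_Q(
        sample_from_pt=lambda: 0.1 * rng.normal(size=(K, d, d)),
        Sigma_inv=np.stack([np.eye(d)] * K),
        X_t=rng.normal(size=d),
        d=d, K=K, gamma=gamma,
    )
    print(Q.shape)  # (4, 3, 3)
```

The max_tries cap exists only so the sketch cannot loop forever; the extracted snippet of the algorithm itself specifies no such parameter.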