Online Learning with Gaussian Payoffs and Side Observations

Authors: Yifan Wu, András György, Csaba Szepesvari

NeurIPS 2015

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | For the first time in the literature, we provide non-asymptotic problem-dependent lower bounds on the regret of any algorithm, which recover existing asymptotic problem-dependent lower bounds and finite-time minimax lower bounds available in the literature. We also provide algorithms that achieve the problem-dependent lower bound (up to some universal constant factor) or the minimax lower bounds (up to logarithmic factors).
Researcher Affiliation | Academia | (1) Dept. of Computing Science, University of Alberta, {ywu12,szepesva}@ualberta.ca; (2) Dept. of Electrical and Electronic Engineering, Imperial College London, a.gyorgy@imperial.ac.uk
Pseudocode | Yes | Algorithm 1. 1: Inputs: Σ, α, β : ℕ → [0, ∞). 2: For t = 1, ..., K, observe each action i at least once by playing i_t such that t ∈ S_{i_t}. ... Algorithm 2. 1: Inputs: Σ, δ. 2: Set t_1 = 0, A_1 = [K].
Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for the described methodology is publicly available.
Open Datasets | No | The paper is theoretical and does not involve the use of any datasets for training. Thus, no access information for a public dataset is provided.
Dataset Splits | No | The paper is theoretical and does not involve the use of datasets or specify any training/validation/test splits.
Hardware Specification | No | The paper is theoretical and does not describe any experimental setup that would require hardware specifications.
Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers required for replication.
Experiment Setup | No | The paper is theoretical, focusing on mathematical bounds and algorithms, and does not describe an experimental setup with hyperparameters or training configurations for empirical evaluation.
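
As a rough illustration of step 2 of Algorithm 1 quoted in the Pseudocode row, the sketch below picks, for each action t, a play i_t such that t belongs to S_{i_t}. It assumes (this is not stated on this page) that Σ is a K × K matrix whose entry (i, j) is the variance of the observation of action j when action i is played, with infinity meaning no observation, so that S_i is the set of actions with finite variance in row i. The function names, the smallest-variance tie-break, and the example matrix are hypothetical and only serve the illustration.

```python
import math


def observable_set(sigma, i):
    """Return S_i: actions whose payoff is observed (finite variance) when i is played.

    Assumption: sigma[i][j] is the observation variance of action j under play i.
    """
    return {j for j, var in enumerate(sigma[i]) if math.isfinite(var)}


def initial_plays(sigma):
    """For each target action t (0-indexed here), choose a play i_t with t in S_{i_t}.

    Picking the play with the smallest variance is an illustrative choice,
    not something taken from the paper.
    """
    K = len(sigma)
    plays = []
    for t in range(K):
        # Plays under which action t is actually observed.
        candidates = [i for i in range(K) if math.isfinite(sigma[i][t])]
        if not candidates:
            raise ValueError(f"action {t} cannot be observed by any play")
        plays.append(min(candidates, key=lambda i: sigma[i][t]))
    return plays


INF = math.inf
# Hypothetical 3-action example: rows are played actions, columns are observed actions.
sigma = [
    [1.0, 4.0, INF],
    [INF, 1.0, INF],
    [2.0, INF, 0.5],
]
plays = initial_plays(sigma)
```

After these K rounds every action has been observed at least once, which is the only property step 2 asks for; any other rule that satisfies the membership condition would do equally well.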