Online Learning with Gaussian Payoffs and Side Observations
Authors: Yifan Wu, András György, Csaba Szepesvári
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | For the first time in the literature, we provide non-asymptotic problem-dependent lower bounds on the regret of any algorithm, which recover existing asymptotic problem-dependent lower bounds and finite-time minimax lower bounds available in the literature. We also provide algorithms that achieve the problem-dependent lower bound (up to some universal constant factor) or the minimax lower bounds (up to logarithmic factors). |
| Researcher Affiliation | Academia | ¹Dept. of Computing Science, University of Alberta, {ywu12,szepesva}@ualberta.ca; ²Dept. of Electrical and Electronic Engineering, Imperial College London, a.gyorgy@imperial.ac.uk |
| Pseudocode | Yes | Algorithm 1. 1: Inputs: $\Sigma$, $\alpha$, $\beta : \mathbb{N} \to [0, \infty)$. 2: For $t = 1, \ldots, K$, observe each action $i$ at least once by playing $i_t$ such that $t \in S_{i_t}$. ... Algorithm 2. 1: Inputs: $\Sigma$, $\delta$. 2: Set $t_1 = 0$, $A_1 = [K]$. |
| Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | No | The paper is theoretical and does not involve the use of any datasets for training. Thus, no access information for a public dataset is provided. |
| Dataset Splits | No | The paper is theoretical and does not involve the use of datasets or specify any training/validation/test splits. |
| Hardware Specification | No | The paper is theoretical and does not describe any experimental setup that would require hardware specifications. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies with version numbers required for replication. |
| Experiment Setup | No | The paper is theoretical, focusing on mathematical bounds and algorithms, and does not describe an experimental setup with hyperparameters or training configurations for empirical evaluation. |