Stochastic Online Greedy Learning with Semi-bandit Feedbacks
Authors: Tian Lin, Jian Li, Wei Chen
NeurIPS 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We first propose the greedy regret and ϵ-quasi greedy regret as learning metrics comparing with the performance of offline greedy algorithm. We then propose two online greedy learning algorithms with semi-bandit feedbacks... Both algorithms achieve O(log T) problem-dependent regret bound... We further show that the bound is tight... Due to the space constraint, the analysis of algorithms, applications and empirical evaluation of the lower bound are moved to the supplementary material. |
| Researcher Affiliation | Collaboration | Tian Lin (Tsinghua University, Beijing, China) lintian06@gmail.com; Jian Li (Tsinghua University, Beijing, China) lapordge@gmail.com; Wei Chen (Microsoft Research, Beijing, China) weic@microsoft.com |
| Pseudocode | Yes | Algorithm 1: OG; Subroutine 2: UCB(A, X̂(·), N(·), t); Subroutine 3: LUCBϵ,δ(A, X̂(·), N(·), t); Algorithm 4: OG-LUCB-R (i.e., OG-LUCB with Restart). (An illustrative sketch of this structure appears below the table.) |
| Open Source Code | No | The paper does not provide any statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | No | The paper does not mention any specific dataset used for training or provide access information for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific details on training, validation, or test dataset splits. |
| Hardware Specification | No | The paper does not explicitly describe the hardware specifications used to run its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers. |
| Experiment Setup | No | The paper does not contain specific experimental setup details, such as concrete hyperparameter values or training configurations. |
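
As a reading aid for the pseudocode row above, here is a minimal Python sketch of how an online greedy learner with semi-bandit feedback can be assembled from those pieces: a greedy outer loop that builds a solution element by element, a UCB-style index as the selection subroutine, and per-element feedback on every chosen element. This is not the paper's exact algorithm: the function name `og_ucb`, the `pull` oracle, the flat no-repeat element-selection decision class, and the particular confidence radius are all illustrative assumptions, and the paper's analysis works with a richer prefix/accessory formulation.

```python
import math
from collections import defaultdict


def og_ucb(arms, k, T, pull):
    """Sketch of online greedy selection with UCB indices (semi-bandit).

    arms : list of base elements
    k    : solution size, i.e. number of greedy phases per round
    T    : number of rounds
    pull : pull(element) -> stochastic reward in [0, 1]
    """
    mean = defaultdict(float)   # empirical mean reward of each element
    count = defaultdict(int)    # number of times each element was observed
    t = 0

    def index(a):
        # Optimistic index: empirical mean plus a generic confidence radius.
        # Unobserved elements get +inf so each is tried at least once.
        if count[a] == 0:
            return float("inf")
        return mean[a] + math.sqrt(3.0 * math.log(t) / (2.0 * count[a]))

    for _ in range(T):
        t += 1
        solution = []
        for _ in range(k):
            # Greedy phase: extend the current prefix with the element
            # whose optimistic index is highest among those not yet chosen.
            candidates = [a for a in arms if a not in solution]
            solution.append(max(candidates, key=index))
        # Semi-bandit feedback: one stochastic sample per selected element.
        for a in solution:
            r = pull(a)
            count[a] += 1
            mean[a] += (r - mean[a]) / count[a]
    return solution
```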
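
A hypothetical usage example, assuming Bernoulli rewards with made-up means `mu`:

```python
import random

# Hypothetical instance: 10 Bernoulli elements with made-up means.
arms = list(range(10))
mu = [(a + 1) / 20.0 for a in arms]   # success probabilities 0.05 .. 0.50
best = og_ucb(arms, k=3, T=2000,
              pull=lambda a: float(random.random() < mu[a]))
print(best)   # tends toward the three highest-mean elements as T grows
```

Because each round reveals one sample for every selected element rather than a single aggregate reward, all phases' estimates improve simultaneously; this semi-bandit feedback is what underlies the O(log T) problem-dependent regret cited in the table.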