Combinatorial Partial Monitoring Game with Linear Feedback and Its Applications

Authors: Tian Lin, Bruno Abrahao, Robert Kleinberg, John Lui, Wei Chen

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Theoretical In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, infinite outcome space of the environment and exponentially large action space of the player. We present the Global Confidence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and finite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves O(T 2 3 log T) distribution-independent regret and O(log T) distribution-dependent regret (the latter assuming unique optimal action), where T is the total time steps played. Moreover, the regret bounds only depend linearly on log |X| rather than |X|, where X is the action space. GCB isolates offline optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowdsourcing application leading to both an efficient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.
Researcher Affiliation Collaboration Tian Lin LINT10@MAILS.TSINGHUA.EDU.CN Bruno Abrahao ABRAHAO@CS.CORNELL.EDU Robert Kleinberg RDK@CS.CORNELL.EDU John C.S. Lui CSLUI@CSE.CUHK.EDU.HK Wei Chen WEIC@MICROSOFT.COM Tsinghua University, Beijing, China Cornell University, Ithaca, NY 14850, USA The Chinese University of Hong Kong, Shatin, NT, Hong Kong Microsoft Research, Beijing, China
Pseudocode Yes Algorithm 1 GCB Require: σ, α, f X (t); x X, r(x, ), Mx.
Open Source Code No The paper does not provide any statement or link indicating that source code for the described methodology is publicly available.
Open Datasets No The paper describes a "crowdsourcing application" as an example to illustrate the applicability of their model, but it does not mention using a specific, publicly available dataset for training or evaluation.
Dataset Splits No The paper is theoretical and does not describe empirical experiments with data, thus it does not specify any training/validation/test dataset splits.
Hardware Specification No The paper is theoretical and focuses on algorithm design and analysis; it does not describe any computational experiments or provide details on the hardware used.
Software Dependencies No The paper focuses on theoretical algorithm design and analysis. It does not specify any software dependencies with version numbers for reproducing experiments.
Experiment Setup No The paper is theoretical and proposes a new model and algorithm. It does not include specific experimental setup details, hyperparameters, or training configurations, as no empirical experiments are described.