reproducibilityindex.ai

Combinatorial Partial Monitoring Game with Linear Feedback and Its Applications

Authors: Tian Lin, Bruno Abrahao, Robert Kleinberg, John Lui, Wei Chen

ICML 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Theoretical	In this paper, we propose the model of combinatorial partial monitoring games with linear feedback, a model which simultaneously addresses limited feedback, inﬁnite outcome space of the environment and exponentially large action space of the player. We present the Global Conﬁdence Bound (GCB) algorithm, which integrates ideas from both combinatorial multi-armed bandits and ﬁnite partial monitoring games to handle all the above issues. GCB only requires feedback on a small set of actions and achieves O(T 2 3 log T) distribution-independent regret and O(log T) distribution-dependent regret (the latter assuming unique optimal action), where T is the total time steps played. Moreover, the regret bounds only depend linearly on log \|X\| rather than \|X\|, where X is the action space. GCB isolates ofﬂine optimization tasks from online learning and avoids explicit enumeration of all actions in the online learning part. We demonstrate that our model and algorithm can be applied to a crowdsourcing application leading to both an efﬁcient learning algorithm and low regret, and argue that they can be applied to a wide range of combinatorial applications constrained with limited feedback.
Researcher Affiliation	Collaboration	Tian Lin LINT10@MAILS.TSINGHUA.EDU.CN Bruno Abrahao ABRAHAO@CS.CORNELL.EDU Robert Kleinberg RDK@CS.CORNELL.EDU John C.S. Lui CSLUI@CSE.CUHK.EDU.HK Wei Chen WEIC@MICROSOFT.COM Tsinghua University, Beijing, China Cornell University, Ithaca, NY 14850, USA The Chinese University of Hong Kong, Shatin, NT, Hong Kong Microsoft Research, Beijing, China
Pseudocode	Yes	Algorithm 1 GCB Require: σ, α, f X (t); x X, r(x, ), Mx.
Open Source Code	No	The paper does not provide any statement or link indicating that source code for the described methodology is publicly available.
Open Datasets	No	The paper describes a "crowdsourcing application" as an example to illustrate the applicability of their model, but it does not mention using a specific, publicly available dataset for training or evaluation.
Dataset Splits	No	The paper is theoretical and does not describe empirical experiments with data, thus it does not specify any training/validation/test dataset splits.
Hardware Specification	No	The paper is theoretical and focuses on algorithm design and analysis; it does not describe any computational experiments or provide details on the hardware used.
Software Dependencies	No	The paper focuses on theoretical algorithm design and analysis. It does not specify any software dependencies with version numbers for reproducing experiments.
Experiment Setup	No	The paper is theoretical and proposes a new model and algorithm. It does not include specific experimental setup details, hyperparameters, or training configurations, as no empirical experiments are described.