Contextual Combinatorial Cascading Bandits

Authors: Shuai Li, Baoxiang Wang, Shengyu Zhang, Wei Chen

ICML 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on synthetic and real datasets demonstrate the advantage of involving contextual information and position discounts. We evaluate our algorithm, C3-UCB, in a synthetic setting and two real applications.
Researcher Affiliation | Collaboration | Shuai Li (SHUAILI@CSE.CUHK.EDU.HK), The Chinese University of Hong Kong, Hong Kong; Baoxiang Wang (BXWANG@CSE.CUHK.EDU.HK), The Chinese University of Hong Kong, Hong Kong; Shengyu Zhang (SYZHANG@CSE.CUHK.EDU.HK), The Chinese University of Hong Kong, Hong Kong; Wei Chen (WEIC@MICROSOFT.COM), Microsoft Research, Beijing, China
Pseudocode | Yes | Algorithm 1 C3-UCB
Open Source Code | No | The paper does not provide any links to open-source code or explicitly state that code is available.
Open Datasets | Yes | MovieLens (Lam & Herlocker, 2015); Rocketfuel dataset (Spring et al., 2004)
Dataset Splits | No | The paper does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) needed for reproduction.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | In experiments, we set the position discounts $\gamma_k$ to be $\gamma^{k-1}$ for some $\gamma$. The problem is a contextual cascading bandit with $L = 200$ items and $K = 4$, where at each time $t$ the agent recommends $K$ items to the user. At first, we randomly choose a $\theta \in \mathbb{R}^{d-1}$ with $\|\theta\|_2 = 1$ and let $\theta_* = (\theta/2, 1/2)$. Then at each time $t$, we randomly assign $x'_{t,a} \in \mathbb{R}^{d-1}$ with $\|x'_{t,a}\|_2 = 1$ to arm $a$ and use $x_{t,a} = (x'_{t,a}, 1)$ to be the contextual information for arm $a$.
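
The synthetic setup quoted above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code (the paper releases none). It assumes the true parameter is theta_* = (theta/2, 1/2), which keeps every expected weight theta_*^T x_{t,a} in [0, 1]. The dimension d = 20, the discount gamma = 0.9, the seed, uniform sampling on the unit sphere, and all function names are assumptions made for illustration; the quoted setup fixes only L = 200 and K = 4.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a vector with Euclidean norm 1 by normalizing Gaussian samples.

    Uniform sampling on the sphere is an assumption; the quoted setup only
    says the vectors are chosen randomly.
    """
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 0.0:
            return [c / norm for c in v]


def make_theta_star(d, rng):
    """Build theta_* = (theta / 2, 1/2) from a unit vector theta in R^(d-1)."""
    theta = random_unit_vector(d - 1, rng)
    return [c / 2.0 for c in theta] + [0.5]


def make_contexts(num_items, d, rng):
    """Build x_{t,a} = (x'_{t,a}, 1) for every arm, with x'_{t,a} a unit vector."""
    return [random_unit_vector(d - 1, rng) + [1.0] for _ in range(num_items)]


def position_discounts(k_positions, gamma):
    """Return [gamma^(k-1) for k = 1..K]; the first position is undiscounted."""
    return [gamma ** k for k in range(k_positions)]


def expected_weights(theta_star, contexts):
    """Expected weight of each arm: the inner product theta_*^T x_{t,a}."""
    return [sum(t * x for t, x in zip(theta_star, ctx)) for ctx in contexts]


rng = random.Random(0)            # assumed seed, for repeatability only
L, K = 200, 4                     # values stated in the experiment setup
d, gamma = 20, 0.9                # assumed values, not given in the quoted text

theta_star = make_theta_star(d, rng)
contexts = make_contexts(L, d, rng)          # one round's contexts
weights = expected_weights(theta_star, contexts)
discounts = position_discounts(K, gamma)
```

Since theta and x'_{t,a} are unit vectors, their inner product lies in [-1, 1] by the Cauchy-Schwarz inequality, so halving it and adding 1/2 yields a valid probability for each arm. Calling `make_contexts` again with the same `rng` would produce fresh contexts for the next round.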