Thompson Sampling for Stochastic Bandits with Graph Feedback

Authors: Aristide Tossou, Christos Dimitrakakis, Devdatt Dubhashi

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We illustrate its performance through extensive experimental results on real and simulated networks with graph feedback. More specifically, we tested our algorithms on power law, planted partition and Erdős-Rényi graphs, as well as on graphs derived from Facebook and Flixster data.
Researcher Affiliation | Academia | Aristide C. Y. Tossou, Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden (aristide@chalmers.se); Christos Dimitrakakis, University of Lille, France / Chalmers University of Technology / Harvard University, USA (christos.dimitrakakis@gmail.com); Devdatt Dubhashi, Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden (dubhashi@chalmers.se)
Pseudocode | Yes | Algorithm 1 TS-N (Bernoulli case) and Algorithm 2 TS-Max N
Open Source Code | No | The paper only states: "Our source code and data sets will be made available on an open hosting website." No repository link is provided.
Open Datasets | No | The paper mentions using "graphs derived from Facebook and Flixster data" and "data used in (Caron et al. 2012)" but does not provide specific access information (links, DOIs, or formal citations to publicly available datasets) for these real-world graphs. Simulated graph types are mentioned but are generated, not sourced.
Dataset Splits | No | The paper does not provide specific training, validation, or test dataset splits, percentages, or sample counts. It describes how experimental trials were conducted and how results were aggregated (median-of-means estimator), but not data partitioning.
Hardware Specification | No | No specific hardware details (e.g., CPU or GPU models, memory, or cloud instance types) used for running the experiments are mentioned in the paper.
Software Dependencies | No | The paper does not specify any software names with version numbers that would be necessary to replicate the experiments (e.g., Python, PyTorch, or TensorFlow versions, or specific libraries).
Experiment Setup | Yes | For each arm i, set Si = 1 and Fi = 1. For all rounds t = 1, ..., T: for each arm i, sample θi from the Beta distribution Beta(Si, Fi)... In our experiments, we always use a Beta(1,1) prior for all rewards. For all of our experiments, we performed 210 independent trials... In our synthetic problems, unless otherwise stated, the rewards are drawn from a Bernoulli distribution whose mean is generated uniformly at random in [0.45, 0.55], except for the optimal arm, whose mean is generated randomly in [0.55, 0.6]. The number of nodes in the graph is 500. We tested with a sparse graph of 2,500 edges and also with a dense graph of 62,625 edges. ϵ-greedy-D and ϵ-greedy-LP have the hyper-parameters c and d, which control the amount of exploration; we found that their performance is highly sensitive to this choice. In our experiments, we find the optimal values for these parameters by performing a separate grid search for each problem and report only the best results.
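The experiment-setup row above describes the core TS-N loop: each arm keeps Beta(Si, Fi) posterior counts, one θi is sampled per arm each round, the arm with the largest sample is played, and, under graph feedback, the rewards of the played arm and its graph neighbors are all observed and used to update the counts. The sketch below illustrates that loop under stated assumptions: the function name `ts_n`, the boolean adjacency-matrix encoding of the feedback graph, and the regret bookkeeping are illustrative choices, not taken from the paper.

```python
import numpy as np

def ts_n(adj, means, horizon, rng):
    """Illustrative sketch of TS-N (Thompson Sampling with graph feedback,
    Bernoulli case). adj is a boolean adjacency matrix with self-loops:
    playing arm i reveals the reward of every arm j with adj[i, j] True.
    Returns the cumulative (pseudo-)regret over `horizon` rounds."""
    n = len(means)
    S = np.ones(n)  # Beta success counts; Si = 1 gives the Beta(1, 1) prior
    F = np.ones(n)  # Beta failure counts; Fi = 1 likewise
    best = means.max()
    regret = 0.0
    for _ in range(horizon):
        theta = rng.beta(S, F)            # one posterior sample per arm
        i = int(np.argmax(theta))         # play the arm with the largest sample
        regret += best - means[i]
        observed = np.flatnonzero(adj[i])  # played arm plus its neighbors
        rewards = rng.random(len(observed)) < means[observed]  # Bernoulli draws
        S[observed] += rewards            # successes update S
        F[observed] += ~rewards           # failures update F
    return regret
```

With a complete feedback graph every round updates every arm, so the posteriors separate quickly and the regret stays far below that of always playing a suboptimal arm.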
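The dataset-splits row notes that results across independent trials were aggregated with a median-of-means estimator. As a reminder of what that estimator does, here is a minimal sketch: split the per-trial values into blocks, average each block, and take the median of the block means, which is robust to heavy-tailed outliers. The function name and block-splitting convention (`np.array_split` on the given order) are assumptions for illustration, not details from the paper.

```python
import numpy as np

def median_of_means(samples, num_blocks):
    """Median-of-means estimator: partition `samples` into `num_blocks`
    contiguous blocks, average each block, and return the median of
    those block means."""
    samples = np.asarray(samples, dtype=float)
    blocks = np.array_split(samples, num_blocks)
    return float(np.median([b.mean() for b in blocks]))
```

For example, six values split into three blocks of two give block means whose median is the estimate.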