Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Information Directed Sampling for Stochastic Bandits With Graph Feedback

Authors: Fang Liu, Swapna Buccapatnam, Ness Shroff

AAAI 2018 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, using numerical evaluations, we demonstrate that our proposed IDS policies outperform existing approaches, including adaptions of upper confidence bound, ϵ-greedy and Exp3 algorithms."
Researcher Affiliation | Collaboration | Fang Liu, The Ohio State University, Columbus, Ohio 43210, EMAIL; Swapna Buccapatnam, AT&T Labs Research, Middletown, NJ 07748, EMAIL; Ness Shroff, The Ohio State University, Columbus, Ohio 43710, EMAIL
Pseudocode | Yes | "Algorithm 1 Meta-algorithm for Information Directed Sampling with Graph Feedback"
Open Source Code | No | The paper does not provide any concrete access (links or explicit statements) to open-source code for the described methodology.
Open Datasets | No | Section 7, "Numerical Results," describes a simulated Beta-Bernoulli bandit problem in which arm means are drawn from a Beta(1,1) distribution. This is an internally generated simulation environment, not a publicly available dataset with concrete access information (link or citation).
Dataset Splits | No | The paper reports running experiments for a given time horizon T and averaging results over 1000 trials, but it does not specify explicit training, validation, or test splits; the data is generated by simulation rather than drawn from a fixed dataset.
Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions implementing "Algorithm 2 in Russo and Van Roy (2014)" and using Beta-Bernoulli bandits, but does not list any software dependencies with version numbers (e.g., Python, PyTorch, specific libraries).
Experiment Setup | Yes | "In the experiment, we set K = 5 and T = 1000. All the regret results are averaged over 1000 trials."
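The pseudocode entry above refers to the information-directed sampling principle of Russo and Van Roy, which trades off estimated instantaneous regret against information gain. As a rough illustration only, the sketch below shows a deterministic simplification of that selection rule; the actual algorithm optimizes over randomized action distributions and graph-feedback quantities, and the `delta`/`gain` inputs here are hypothetical stand-ins for the learner's posterior estimates, not the paper's exact definitions.

```python
import math

def ids_action(delta, gain, eps=1e-12):
    """Deterministic IDS-style rule: play the arm minimizing the
    information ratio delta[a]**2 / gain[a].

    delta[a]: estimated instantaneous regret of arm a (hypothetical input)
    gain[a]:  estimated information gain from playing arm a (hypothetical input)
    """
    best, best_ratio = 0, math.inf
    for a, (d, g) in enumerate(zip(delta, gain)):
        if d <= 0:              # an arm believed optimal: zero regret, play it
            return a
        ratio = d * d / max(g, eps)  # squared regret per unit of information
        if ratio < best_ratio:
            best, best_ratio = a, ratio
    return best
```

The squared numerator is what makes IDS willing to take a small immediate regret hit when an arm promises a large reduction in uncertainty.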
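The reported setup (K = 5 Beta-Bernoulli arms, horizon T = 1000, regret averaged over 1000 trials) can be sketched as a simulation harness. The sketch substitutes Thompson sampling for the paper's IDS policies, since the exact IDS computation follows Algorithm 2 of Russo and Van Roy (2014); every name here is an assumption about the setup, not the authors' code.

```python
import random

def run_trial(K=5, T=1000, rng=None):
    """One Beta-Bernoulli bandit trial: arm means drawn from a Beta(1,1)
    prior, played with Thompson sampling (a stand-in for the paper's IDS
    policies). Returns the cumulative expected regret."""
    rng = rng or random.Random()
    means = [rng.betavariate(1, 1) for _ in range(K)]
    best = max(means)
    alpha = [1.0] * K  # Beta posterior parameters per arm
    beta = [1.0] * K
    regret = 0.0
    for _ in range(T):
        # Sample a mean for each arm from its posterior; play the argmax.
        samples = [rng.betavariate(alpha[i], beta[i]) for i in range(K)]
        a = max(range(K), key=lambda i: samples[i])
        reward = 1.0 if rng.random() < means[a] else 0.0
        alpha[a] += reward
        beta[a] += 1.0 - reward
        regret += best - means[a]
    return regret

def average_regret(trials=1000, K=5, T=1000, seed=0):
    """Average cumulative regret over independent trials, as in the paper's
    reported setup (1000 trials)."""
    rng = random.Random(seed)
    return sum(run_trial(K, T, rng) for _ in range(trials)) / trials
```

A smaller run such as `average_regret(trials=100, K=5, T=1000)` reproduces the shape of the experiment at lower cost; regret values here are for the stand-in policy, not the paper's reported curves.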