Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Information Directed Sampling for Stochastic Bandits With Graph Feedback
Authors: Fang Liu, Swapna Buccapatnam, Ness Shroff
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, using numerical evaluations, we demonstrate that our proposed IDS policies outperform existing approaches, including adaptions of upper confidence bound, ϵ-greedy and Exp3 algorithms. |
| Researcher Affiliation | Collaboration | Fang Liu The Ohio State University Columbus, Ohio 43210 EMAIL Swapna Buccapatnam AT&T Labs Research Middletown, NJ 07748 EMAIL Ness Shroff The Ohio State University Columbus, Ohio 43710 EMAIL |
| Pseudocode | Yes | Algorithm 1 Meta-algorithm for Information Directed Sampling with Graph Feedback |
| Open Source Code | No | The paper does not provide any concrete access (links, explicit statements) to open-source code for the described methodology. |
| Open Datasets | No | Section 7 'Numerical Results' describes a simulated Beta-Bernoulli bandit problem where reward values are drawn from a Beta(1,1) distribution. This is an internally generated simulation environment, not a publicly available dataset with concrete access information (link, citation). |
| Dataset Splits | No | The paper mentions running experiments for a given time horizon T and averaging results over 1000 trials, but it does not specify explicit training, validation, or test dataset splits in the conventional sense, as the data is generated through simulation rather than being a fixed dataset. |
| Hardware Specification | No | The paper does not provide any specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions implementing 'Algorithm 2 in Russo and Van Roy (2014)' and using 'Beta-Bernoulli bandits', but does not list any specific software dependencies with version numbers (e.g., Python, PyTorch, specific libraries). |
| Experiment Setup | Yes | In the experiment, we set K = 5 and T = 1000. All the regret results are averaged over 1000 trials. |
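The environment the excerpts describe — a K-armed Beta-Bernoulli bandit with arm means drawn from Beta(1,1), run for horizon T = 1000 and averaged over repeated trials — can be sketched as follows. This is a minimal illustration of the simulated setup only, using an ϵ-greedy baseline (one of the comparison policies named in the abstract), not the authors' IDS implementation; the ϵ value and the reduced trial count are assumptions for brevity.

```python
import numpy as np

def run_bandit_trial(rng, K=5, T=1000, epsilon=0.1):
    """One Beta-Bernoulli bandit trial with an epsilon-greedy baseline.

    Arm means are drawn from Beta(1,1) (i.e., uniform on [0,1]), matching
    the simulated environment described in the paper. epsilon=0.1 is an
    assumed value; the paper does not report its baseline's parameters.
    """
    means = rng.beta(1.0, 1.0, size=K)   # true arm means ~ Beta(1,1)
    counts = np.zeros(K)
    estimates = np.zeros(K)              # running empirical mean per arm
    best = means.max()
    regret = 0.0
    for _ in range(T):
        if rng.random() < epsilon:       # explore uniformly at random
            arm = int(rng.integers(K))
        else:                            # exploit current best estimate
            arm = int(np.argmax(estimates))
        reward = float(rng.random() < means[arm])   # Bernoulli reward
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        regret += best - means[arm]      # expected (pseudo-)regret
    return regret

rng = np.random.default_rng(0)
# The paper averages over 1000 trials; 100 is used here to keep the sketch fast.
avg_regret = np.mean([run_bandit_trial(rng) for _ in range(100)])
print(f"average cumulative regret over 100 trials: {avg_regret:.1f}")
```

The per-step regret uses the gap between the best arm's mean and the chosen arm's mean (expected regret), which is the standard quantity plotted in such experiments.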