Unimodal Thompson Sampling for Graph-Structured Arms

Authors: Stefano Paladino, Francesco Trov˜, Marcello Restelli, Nicola Gatti

AAAI 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We provide a thorough experimental evaluation of the performance of our and state of the art algorithms as the properties of the graph vary.
Researcher Affiliation Academia Stefano Paladino, Francesco Trov o, Marcello Restelli, Nicola Gatti Dipartimento di Elettronica, Informazione e Bioingegneria Politecnico di Milano, Milano, Italy
Pseudocode Yes Algorithm 1 UTS
Open Source Code No The paper does not provide any statement about releasing source code or a link to a code repository.
Open Datasets No The paper describes how graphs and rewards are generated (e.g., 'Erd os-R enyi graph is generated by connecting nodes randomly', 'Bernoulli rewards whose averages have a triangular shape') and refers to settings from other papers. However, it does not provide concrete access information (link, DOI, specific citation with authors/year) to a publicly available dataset used for training.
Dataset Splits No The paper describes experimental settings like 'horizon T' and '100 independent trials' but does not specify dataset splits (training, validation, test) for data partitioning. Since the data is simulated/generated, standard splits are not applicable or described in a way that allows reproduction of specific splits.
Hardware Specification No The paper does not specify any hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch, specific libraries and their versions) needed to replicate the experiment.
Experiment Setup Yes We consider graphs with K {5, 10, 20, 50, 100, 1000} and with probability p {1, 1 K , ℓ}. The optimal arm is chosen randomly among the existing arms and its reward is given by a Bernoulli distribution with expected value 0.9. The rewards of the suboptimal arms are given by Bernoulli distributions with expected value depending on their distance from the optimal one. More precisely, let d i be the shortest path from the i th arm to the optimal arm and let: d max = max i {1,...,K} d i be the maximum shortest path of the graph. The expected reward of the i th arm is: μi = 0.9 d i (0.9 0.1) / d max.