reproducibilityindex.ai

A Unifying Theory of Thompson Sampling for Continuous Risk-Averse Bandits

Authors: Joel Q. L. Chang, Vincent Y. F. Tan
Pages: 6159-6166

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical simulations show that the regret bounds incurred by our algorithms are reasonably tight vis-a-vis algorithm-independent lower bounds. (...) Numerical Experiments We verify our theory via numerical experiments on ρ-NPTS for new risk measures that are linear combinations of existing ones.
Researcher Affiliation	Academia	Joel Q. L. Chang¹, Vincent Y. F. Tan¹,² ¹Department of Mathematics, National University of Singapore ²Department of Electrical and Computer Engineering, National University of Singapore
Pseudocode	Yes	Algorithm 1: ρ-MTS (...) Algorithm 2: ρ-NPTS
Open Source Code	Yes	The Java code to reproduce the plots in Figure 2 can be found at tinyurl.com/unifyRhoTs.
Open Datasets	No	The paper uses simulated data based on specified probability distributions (Beta(1, 3), Beta(3, 3), Beta(3, 1)). It does not use or provide access information for a pre-existing publicly available dataset.
Dataset Splits	No	The paper describes its simulation setup, including the number of arms, time steps, and experiments, but it does not specify explicit train/validation/test dataset splits. The experiments involve simulating bandit processes rather than using fixed datasets with defined splits.
Hardware Specification	No	The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. It only mentions 'Numerical Experiments'.
Software Dependencies	No	The paper mentions 'The Java code to reproduce the plots' but does not specify the version of Java or any other software libraries with their version numbers that are necessary for reproducibility.
Experiment Setup	Yes	We consider a 3-arm bandit instance (i.e., K = 3) with a horizon of n = 5,000 time steps and over 50 experiments, where the arms 1, 2, 3 follow probability distributions Beta(1, 3), Beta(3, 3), Beta(3, 1) respectively. (...) Define the risk functionals ρ1 := MV_0.5 + CVaR_0.95 and ρ2 := Prop_0.7 + LB_0.6 on (P_c^(B), D_L), where we set (γ, α, p, q) = (0.5, 0.95, 0.7, 0.6) as the parameters for the mean-variance, CVaR, Proportional risk hazard, and Lookback components respectively (see Table 1).
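
The arm distributions quoted in the Experiment Setup row can be simulated with the Java standard library alone. The following is a minimal sketch, not the authors' code: the class name `BetaArms`, its method names, and the seed are hypothetical, and it covers only the reward sampling, not ρ-NPTS or the risk functionals (whose exact definitions are in the paper's Table 1). It relies on the fact that, for positive integers a and b, the a-th smallest of a + b - 1 independent Uniform(0, 1) draws follows Beta(a, b).

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical helper class; names are illustrative and not taken from the paper's code.
class BetaArms {
    // Draws one sample from Beta(a, b) for positive integers a and b:
    // the a-th smallest of (a + b - 1) i.i.d. Uniform(0, 1) draws is Beta(a, b) distributed.
    static double sampleBeta(int a, int b, Random rng) {
        double[] u = new double[a + b - 1];
        for (int i = 0; i < u.length; i++) {
            u[i] = rng.nextDouble();
        }
        Arrays.sort(u);
        return u[a - 1];
    }

    // Averages `draws` independent samples from Beta(a, b).
    static double empiricalMean(int a, int b, int draws, Random rng) {
        double sum = 0.0;
        for (int i = 0; i < draws; i++) {
            sum += sampleBeta(a, b, rng);
        }
        return sum / draws;
    }

    public static void main(String[] args) {
        int[][] arms = {{1, 3}, {3, 3}, {3, 1}}; // Beta parameters of arms 1, 2, 3
        int n = 5000;                            // horizon quoted in the setup
        Random rng = new Random(0);              // arbitrary seed (an assumption)
        for (int[] arm : arms) {
            double exact = (double) arm[0] / (arm[0] + arm[1]); // mean of Beta(a, b) is a / (a + b)
            System.out.printf("Beta(%d, %d): empirical mean %.3f, exact mean %.3f%n",
                    arm[0], arm[1], empiricalMean(arm[0], arm[1], n, rng), exact);
        }
    }
}
```

The order-statistic construction only works for integer parameters, which is enough for the three arms above; other Beta parameters would need a general-purpose sampler.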