reproducibilityindex.ai

When Are Linear Stochastic Bandits Attackable?

Authors: Huazheng Wang, Haifeng Xu, Hongning Wang

ICML 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Numerical experiments further validate the effectiveness and cost-efﬁciency of the proposed attack method. and 5. Experiments We use simulation-based experiments to validate the effectiveness and cost-efﬁciency of our proposed attack methods.
Researcher Affiliation	Academia	Huazheng Wang * 1 Haifeng Xu 2 Hongning Wang 2 *Most work was done while with University of Virginia. 1Princeton University 2 University of Virginia.
Pseudocode	Yes	Algorithm 1 Two-stage Null Space Attack and Algorithm 2 Oracle Null Space Attack
Open Source Code	No	The paper does not contain any explicit statement or link indicating that the source code for the described methodology is publicly available.
Open Datasets	No	In our simulations, we generate a size-k arm pool A, in which each arm a is associated with a context vector xa Rd. Each dimension of xa is drawn from a set of zero-mean Gaussian distributions with variances sampled from a uniform distribution U(0, 1). Each xa is then normalized to xa 2 = 1. The bandit model parameter θ is sampled from N(0, 1) and normalized to θ 2 = 1.
Dataset Splits	No	The paper describes simulation-based experiments where data is generated for each run. It does not provide explicit dataset split percentages, sample counts, or references to predefined train/validation/test splits of a larger dataset.
Hardware Specification	No	The paper mentions running 'simulation-based experiments' but does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used for these simulations.
Software Dependencies	No	The paper refers to standard algorithms like Lin UCB and Robust Phase Elimination but does not specify any programming languages, libraries, or software environments with version numbers used for the implementation or experiments.
Experiment Setup	Yes	In our simulations, we generate a size-k arm pool A, in which each arm a is associated with a context vector xa Rd. Each dimension of xa is drawn from a set of zero-mean Gaussian distributions with variances sampled from a uniform distribution U(0, 1). Each xa is then normalized to xa 2 = 1. The bandit model parameter θ is sampled from N(0, 1) and normalized to θ 2 = 1. We set d to 10, the standard derivation σ of Gaussian noise ηt and ηt to 0.1, and the arm pool size k to 30 in our simulations. We run the experiment for T = 10, 000 iterations. and Specifically, when T1 = T 1/2, the strategy gives the minimum attack cost in the order of O(T 3/4), and the non-target arms are pulled at most O(T 3/4) rounds. and For any large enough T1, the attack strategy in Algorithm 1 will correctly assert the attackability with high probability. Moreover, when the environment is attackable, with probability at least 1 2δ the attack strategy will fool Robust Ph E to pull non-target arms at most T log(T/δ) + d T log(T) log(1/δ)/ T1 + T 2 1 /ϵ rounds. and Specifically, setting T1 = T 2/5 gives the minimum attack cost order O(T 4/5) and the non-target arms are pulled at most O(T 4/5) rounds.