Observation-Free Attacks on Stochastic Bandits

Authors: Yinglun Xu, Bhuvesh Kumar, Jacob D. Abernethy

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, to intuitively illustrate the behavior of algorithms under corruption by our adversary algorithm, we run simulations attacking UCB, ϵ-greedy and Thompson Sampling algorithm. Each algorithm is tested under the same artificial instance with 2 arms, with means µ1 = 0.9 and µ2 = 0.8. The arm 1 is the optimal arm and we set arm 2 as the target arm for the adversary. We set T = 50000 and the corresponding parameters (C1, C2) for each of the algorithm is listed in Table 1. In Figure 1, we plot some key statistics about the arms as a function of the iterations that can help us understand the behaviour of the algorithms under the attacks. In Figure 2, we plot the number of times the optimal arm is pulled is chosen till round t, i.e. nt i with the iteration t on the x axis in both the settings.
Researcher Affiliation Academia Yinglun Xu University of Illinois at Urbana-Champaign yinglun6@illinois.edu Kumar Bhuvesh Georgia Institute of Technology bhuvesh@gatech.edu Jacob Abernethy Georgia Institute of Technology prof@gatech.edu
Pseudocode Yes Algorithm 1: Observation-Free Attack; Algorithm 2: Bandit learning with data poisoning attack
Open Source Code No The paper does not provide an explicit statement or link for open-source code availability for the described methodology.
Open Datasets No The paper states it uses an 'artificial instance with 2 arms, with means µ1 = 0.9 and µ2 = 0.8' for its simulations, but provides no access information (link, DOI, repository, or formal citation) for a publicly available or open dataset.
Dataset Splits No The paper conducts simulations using an artificial instance and does not describe traditional training, validation, or test dataset splits. It sets total rounds T=50000 and parameters C1 and C2 for the attack phases, but these are not dataset splits.
Hardware Specification No The paper does not provide any specific hardware details such as GPU or CPU models used for running the experiments.
Software Dependencies No The paper does not provide specific software dependencies, such as library names with version numbers, used to implement or run the experiments.
Experiment Setup Yes Each algorithm is tested under the same artificial instance with 2 arms, with means µ1 = 0.9 and µ2 = 0.8. The arm 1 is the optimal arm and we set arm 2 as the target arm for the adversary. We set T = 50000 and the corresponding parameters (C1, C2) for each of the algorithm is listed in Table 1. Table 1: Corruption level parameters for different algorithms