Adversarial Attacks on Adversarial Bandits

Authors: Yuzhe Ma, Zhijin Zhou

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically show that our proposed attack algorithms are efficient on both vanilla and a robust version of Exp3 algorithm Yang et al. (2020).
Researcher Affiliation | Industry | Yuzhe Ma, Microsoft Azure AI, yuzhema@microsoft.com; Zhijin Zhou, Amazon, zhijin@amazon.com
Pseudocode | No | The paper references 'Exp3 algorithm (see algorithm 1 in the appendix)', but the appendix is not included in the provided text. Therefore, pseudocode is not present in the main paper.
Open Source Code | No | The paper does not provide any statement about releasing open-source code or a link to a code repository for its methodology.
Open Datasets | No | The paper describes a synthetic bandit problem setup with 'K = 2 arms' and custom loss functions. It does not refer to any publicly available or open dataset used for training, nor does it provide access information for any dataset.
Dataset Splits | No | The paper describes synthetic experimental scenarios with varying total horizon T, but it does not specify explicit training, validation, or test dataset splits. The experiments are based on simulations over a total number of rounds T.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory, cloud instances) used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | In our first example, we consider a bandit problem with K = 2 arms, a_1 and a_2. The loss function is ∀t, L_t(a_1) = 0.5 and L_t(a_2) = 0. ... In the first experiment, we let the total horizon be T = 10^3, 10^4, 10^5 and 10^6. ... For the other victim ExpRb, we consider different levels of attack budget Φ. ... we consider Φ = T^0.5, T^0.7 and T^0.9. ... Next we apply the general attack (6) to verify that (6) can recover the results of Theorem 4.3 in the easy attack scenario. We fix = 0.25 in (6). ... In our second example, we consider a bandit problem with K = 2 arms and the loss function is ∀t, L_t(a_1) = 1 and L_t(a_2) = 0. ... We let = 0.1, 0.25 and 0.4.
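
Since the paper releases no code and its Exp3 pseudocode sits in an appendix that is not part of the provided text, the sketch below may help readers picture the unattacked environment from the first example: K = 2 arms with constant losses 0.5 and 0, played by a textbook Exp3 learner. It is not the authors' implementation. The function names, the learning rate eta = sqrt(2 ln K / (T K)), the random seed, and the shortened horizons are all assumptions made for illustration, and no attack is implemented because the attack formulas are not reproduced here.

```python
import math
import random


def first_example_loss(t, arm):
    """Losses from the first example: arm 0 (a_1) costs 0.5, arm 1 (a_2) costs 0."""
    return 0.5 if arm == 0 else 0.0


def run_exp3(loss_fn, num_arms, horizon, eta, seed=0):
    """Textbook Exp3 on losses (hypothetical helper, not the paper's Algorithm 1)."""
    rng = random.Random(seed)
    cum_est = [0.0] * num_arms  # importance-weighted cumulative loss estimates
    pulls = [0] * num_arms
    total_loss = 0.0
    for t in range(1, horizon + 1):
        # Exponential weights; subtracting the minimum avoids underflow of all weights.
        base = min(cum_est)
        weights = [math.exp(-eta * (c - base)) for c in cum_est]
        norm = sum(weights)
        probs = [w / norm for w in weights]
        arm = rng.choices(range(num_arms), weights=probs)[0]
        loss = loss_fn(t, arm)
        # Only the pulled arm is observed, so its loss is divided by its probability.
        cum_est[arm] += loss / probs[arm]
        pulls[arm] += 1
        total_loss += loss
    return pulls, total_loss


K = 2
# Shortened horizons (assumption); the paper goes up to T = 10^6.
for T in (10**3, 10**4):
    eta = math.sqrt(2.0 * math.log(K) / (T * K))  # assumed standard tuning
    pulls, total_loss = run_exp3(first_example_loss, K, T, eta)
    print(f"T={T}: pulls of a_1={pulls[0]}, pulls of a_2={pulls[1]}, loss={total_loss}")
```

Without an attacker, the learner should concentrate its pulls on a_2, the zero-loss arm; an attack of the kind studied in the paper would perturb the observed losses to shift those pulls toward a target arm.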