Manipulating a Learning Defender and Ways to Counteract

Authors: Jiarui Gan, Qingyu Guo, Long Tran-Thanh, Bo An, Michael Wooldridge

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluation shows that our approaches can improve the defender's utility significantly as compared to the situation when attacker manipulation is ignored.
Researcher Affiliation | Academia | Jiarui Gan (University of Oxford, Oxford, UK, jiarui.gan@cs.ox.ac.uk); Qingyu Guo (Nanyang Technological University, Singapore, qguo005@e.ntu.edu.sg); Long Tran-Thanh (University of Southampton, Southampton, UK, l.tran-thanh@soton.ac.uk); Bo An (Nanyang Technological University, Singapore, boan@ntu.edu.sg); Michael Wooldridge (University of Oxford, Oxford, UK, mjw@cs.ox.ac.uk)
Pseudocode | Yes | Algorithm 1: Decide if there exists a policy π such that EoP(π) ≥ ξ. (An illustrative sketch of how such a decision procedure can be used appears after the table.)
Open Source Code | No | The paper does not provide concrete access to source code for the methodology described.
Open Datasets | No | In our evaluations, attacker types are randomly generated using the covariance model [15], with a parameter ρ ∈ [0, 1] to control the closeness of the generated game to a zero-sum game.
Dataset Splits | No | The paper describes how attacker types are generated but does not specify dataset split information such as percentages or counts for training, validation, or testing.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies | No | The paper does not provide specific ancillary software details with version numbers.
Experiment Setup | Yes | In our evaluations, attacker types are randomly generated using the covariance model [15], with a parameter ρ ∈ [0, 1] to control the closeness of the generated game to a zero-sum game. That is, we shift each payoff parameter x towards the corresponding one y of a zero-sum attacker type, letting x ← (1 − ρ)·x + ρ·y. All results shown are the average of at least 50 runs. Figure 1 (a) and (b) show the variance of the EoP with respect to ρ and the size of the game. Except for the QR policy with ϕ = 10, the performance of all other policies is very close to each other, though there is a discernible gap between the optimal policy and the SSE policy. In (a), results are obtained with the other parameters set to λ = 100, m = 10, and n = 50; in (b), with m = n/5, ρ = 0.5, and λ = 100. (A sketch of this payoff-generation step appears after the table.)
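
Algorithm 1 is reported above only at the level of its decision problem: given a threshold ξ, decide whether a policy π with EoP(π) ≥ ξ exists. As a hedged illustration (not the paper's implementation), the Python sketch below shows one standard way such a decision procedure can be used: a binary search over ξ that calls a hypothetical oracle exists_policy(game, xi). The oracle name, its interface, and the search bounds are assumptions.

# Hedged sketch: using a yes/no decision procedure to approximate the best EoP.
# exists_policy(game, xi) is a HYPOTHETICAL stand-in for a procedure that
# decides "does a policy pi with EoP(pi) >= xi exist?"; its internals and the
# default search bounds lo/hi are assumptions made for illustration only.
def best_eop_threshold(game, exists_policy, lo=0.0, hi=1.0, tol=1e-6):
    """Binary-search the largest threshold xi that is still feasible,
    assuming feasibility is monotone (feasible at xi implies feasible below xi)."""
    if not exists_policy(game, lo):
        return None  # no policy clears even the lowest threshold
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if exists_policy(game, mid):
            lo = mid  # some policy achieves EoP >= mid; try a higher target
        else:
            hi = mid  # mid is infeasible; lower the target
    return lo

The design choice here is the standard reduction from optimization to a sequence of feasibility checks: with a monotone oracle, the loop converges to within tol of the largest feasible threshold.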
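
The experiment-setup row quotes the covariance-model generation of attacker types. The sketch below is a literal reading of that description under stated assumptions (the payoff array layout, the uniform sampling range, and the name generate_attacker_type are illustrative, not the authors' code): random attacker payoffs x are shifted toward the zero-sum values y via x ← (1 − ρ)x + ρy.

import numpy as np

def generate_attacker_type(defender_payoffs, rho, rng=None):
    """Minimal sketch of the quoted covariance-model step: interpolate random
    attacker payoffs toward the zero-sum counterpart of the defender's payoffs.
    The array layout and the uniform range [-1, 1] are illustrative assumptions."""
    rng = np.random.default_rng() if rng is None else rng
    x = rng.uniform(-1.0, 1.0, size=defender_payoffs.shape)  # random attacker payoffs
    y = -defender_payoffs  # zero-sum attacker: negation of the defender's payoffs
    # rho = 0 keeps the random payoffs; rho = 1 yields an exactly zero-sum attacker.
    return (1.0 - rho) * x + rho * y

# Illustrative usage: 50 targets, game shifted halfway toward zero-sum (rho = 0.5).
rng = np.random.default_rng(0)
defender_payoffs = rng.uniform(-1.0, 1.0, size=50)
attacker_type = generate_attacker_type(defender_payoffs, rho=0.5, rng=rng)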