SHINE: Shielding Backdoors in Deep Reinforcement Learning

Authors: Zhuowen Yuan, Wenbo Guo, Jinyuan Jia, Bo Li, Dawn Song

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We further conduct extensive experiments that evaluate SHINE against three mainstream DRL backdoor attacks in various benchmark RL environments. Our results show that SHINE significantly outperforms existing defenses in mitigating these backdoor attacks."
Researcher Affiliation | Academia | University of Illinois Urbana-Champaign; University of California, Santa Barbara; Pennsylvania State University; University of Chicago; University of California, Berkeley
Pseudocode | Yes | "Algorithm 1 shows our final backdoor shielding algorithm."
Open Source Code | No | The paper does not include an unambiguous statement that the code for the described methodology is publicly available, nor does it provide a direct link to a code repository.
Open Datasets | Yes | "We follow TrojDRL and select three Atari games from the OpenAI Gym (Brockman et al., 2016) environment pool: Pong, Breakout, and Space Invaders." (See the environment sketch below.)
Dataset Splits | No | The paper describes collecting trajectories and retraining agents, but it does not specify explicit training/validation/test dataset splits with percentages or counts for reproduction.
Hardware Specification | Yes | "On average, the trigger detection stage of SHINE takes 12 hours, and the retraining stage takes 5 hours on a single NVIDIA RTX A6000 GPU."
Software Dependencies | No | The paper mentions using 'pytorch', 'gpytorch', and 'stable-baseline' but does not specify their version numbers.
Experiment Setup | Yes | "The key hyper-parameters introduced by our method are the weight of the elastic-net regularization term in the feature-level explanation λ and the strength of the KL constraint in the policy retraining ϵ. We set λ to 1e-4 and ϵ to 0.01." (See the hyper-parameter sketch below.)
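
The Open Datasets row names the three Atari games but not the exact Gym environment IDs. A minimal sketch of loading them, assuming the standard `NoFrameskip-v4` variants (an assumption; the paper names only the games):

```python
# Hedged sketch: instantiate the three Atari tasks named in the paper from the
# OpenAI Gym environment pool (Brockman et al., 2016). The environment IDs and
# version suffixes below are assumptions, not taken from the paper.
import gym

ENV_IDS = {
    "Pong": "PongNoFrameskip-v4",
    "Breakout": "BreakoutNoFrameskip-v4",
    "Space Invaders": "SpaceInvadersNoFrameskip-v4",
}

envs = {name: gym.make(env_id) for name, env_id in ENV_IDS.items()}
for name, env in envs.items():
    # Observation/action spaces fix the agent's input/output shapes.
    print(name, env.observation_space.shape, env.action_space)
```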
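The Experiment Setup row quotes the two hyper-parameters (λ = 1e-4, ϵ = 0.01) but not the objectives they enter. A hedged PyTorch sketch of one plausible reading; the function names, the fidelity term, and the exact form of each penalty are illustrative assumptions, not SHINE's published losses:

```python
# Hedged sketch: where the two quoted hyper-parameters might appear.
# LAMBDA weights an elastic-net (L1 + L2) penalty on a feature-level
# explanation mask; EPSILON bounds a KL divergence during policy retraining.
import torch
import torch.nn.functional as F

LAMBDA = 1e-4   # weight of the elastic-net regularization term (from the paper)
EPSILON = 0.01  # strength of the KL constraint in policy retraining (from the paper)

def explanation_loss(mask: torch.Tensor, fidelity: torch.Tensor) -> torch.Tensor:
    """Fidelity objective plus an elastic-net penalty on the feature mask.
    The fidelity term itself is assumed given; its form is not sketched here."""
    elastic_net = mask.abs().sum() + mask.pow(2).sum()
    return fidelity + LAMBDA * elastic_net

def retraining_kl(new_logits: torch.Tensor, old_logits: torch.Tensor) -> torch.Tensor:
    """KL(old policy || new policy) over action distributions, to be kept
    below EPSILON, e.g., as a trust-region constraint or a penalty term."""
    old_log_probs = F.log_softmax(old_logits, dim=-1)
    new_log_probs = F.log_softmax(new_logits, dim=-1)
    return F.kl_div(new_log_probs, old_log_probs,
                    log_target=True, reduction="batchmean")
```

Under this reading, ϵ limits how far the retrained policy may drift from the original agent in each update, analogous to a TRPO-style trust region; whether SHINE enforces it as a hard constraint or a penalty is not specified in the quoted excerpt.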