Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
SleeperNets: Universal Backdoor Poisoning Attacks Against Reinforcement Learning Agents
Authors: Ethan Rathbun, Christopher Amato, Alina Oprea
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our attack in 6 environments spanning multiple domains and demonstrate significant improvements in attack success over existing methods, while preserving benign episodic return. |
| Researcher Affiliation | Academia | Ethan Rathbun, Christopher Amato, Alina Oprea; Khoury College of Computer Sciences, Northeastern University |
| Pseudocode | Yes | Algorithm 1: The SleeperNets Attack |
| Open Source Code | Yes | Code is attached to the paper submission and provided anonymously here. |
| Open Datasets | Yes | We evaluate each method on a suite of 6 diverse environments against agents trained using the cleanrl [10] implementation of PPO [31]. First, to replicate and validate the results of [4] and [14] we test all attacks on Atari Breakout and Qbert from the Atari gymnasium suite [1]. In our evaluation we found that these environments are highly susceptible to backdoor poisoning attacks, thus we extend and focus our study towards the following 4 environments: Car Racing from the Box2D gymnasium [1], Safety Car from Safety Gymnasium [12], Highway Merge from Highway Env [18], and Trading BTC from Gym Trading Env [27]. |
| Dataset Splits | No | The paper does not explicitly mention a validation split for hyperparameter tuning or model selection, though the agents are trained with PPO through environment interaction rather than on a partitioned dataset. |
| Hardware Specification | Yes | Machines Used in Experimental Results. Laptop: CPU i9-12900HX, GPU RTX A2000, RAM 32GB. Desktop: CPU Threadripper PRO 5955WX, GPU RTX 4090, RAM 128GB. Server: CPU Intel Xeon Silver 4114, GPU none, RAM 128GB. |
| Software Dependencies | No | The paper mentions 'cleanrl [10] implementation of PPO [31]' but does not specify version numbers for cleanrl, PPO, or other general software dependencies like Python, PyTorch/TensorFlow, or CUDA. |
| Experiment Setup | Yes | In Table 4 we summarize the trigger pattern, poisoning budget, target action, and values of c_low and c_high used in each environment. |
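
The Software Dependencies row notes that the paper gives no version numbers. As a minimal sketch, assuming a Python environment, someone re-running the experiments could record the versions actually installed as shown below. The package list and the helper name `installed_versions` are illustrative assumptions and are not taken from the paper or its code.

```python
from importlib import metadata

# Hypothetical package list: the paper names cleanrl and PPO but gives no
# versions; gymnasium and torch are assumptions made for illustration only.
PACKAGES = ["cleanrl", "gymnasium", "torch"]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}=={version}" if version else f"{name}: not installed")
```

Saving this output alongside experiment results gives a later reader the dependency information that the report marks as missing.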