Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Multi-level Certified Defense Against Poisoning Attacks in Offline Reinforcement Learning

Authors: Shijie Liu, Andrew Cullen, Paul Montague, Sarah Erfani, Benjamin Rubinstein

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Empirical evaluations demonstrate that our approach ensures the performance drops to no more than 50% with up to 7% of the training data poisoned, significantly improving over the 0.008% in prior work (Wu et al., 2022), while producing certified radii that are 5 times larger as well. This highlights the potential of our framework to enhance safety and reliability in offline RL.
Researcher Affiliation | Academia | School of Computing and Information Systems, University of Melbourne, Melbourne, Australia; Defence Science and Technology Group, Adelaide, Australia
Pseudocode | Yes | Algorithm 1: Sampled Gaussian Mechanism (SGM) for a Model M using Dataset D. Algorithm 2: Model Training with DP-FEDAVG.
Open Source Code | No | The paper does not provide any specific links to source code repositories or explicit statements about code release for the methodology described.
Open Datasets | Yes | We conducted evaluations using Farama Gymnasium (Towers et al., 2023) discrete Atari games Freeway and Breakout, as well as the continuous action space Mujoco game Half Cheetah. We also employed the D4RL (Fu et al., 2020) dataset and the Opacus (Yousefpour et al., 2021) DP framework.
Dataset Splits | No | Our offline RL datasets consist of 2 million transitions for each game, with corresponding trajectory counts of 976 for Freeway, 3,648 for Breakout, and 2,000 for Half Cheetah. However, the paper does not specify how these datasets are split into training, validation, or test sets.
Hardware Specification | Yes | Implemented with Convolutional Neural Networks (CNN) in PyTorch on an NVIDIA A100 80GB GPU.
Software Dependencies | No | The paper mentions 'PyTorch' and the 'Opacus (Yousefpour et al., 2021) DP framework' but does not specify their version numbers. It also lists algorithms like DQN, IQL, and C51 without specific software versions.
Experiment Setup | Yes | In all experiments, the sample rates q in the DP training algorithms were adjusted to achieve a batch size of 32, with varying noise multipliers σ as detailed in the results. Uncertainties were estimated within a confidence interval suitable for δ = 0.001. For each game, the number of policy instances p, as described in Section 4.1, is set to 50. The number of estimations of expected cumulative reward m is set to 500, with 10 estimations per policy instance, as detailed in Section 4.2.
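To make the pseudocode row concrete: the paper's Algorithm 1 names a Sampled Gaussian Mechanism (SGM), the standard DP primitive of Poisson subsampling, per-record gradient clipping, and Gaussian noise (as implemented by frameworks like Opacus). The sketch below is a minimal numpy illustration of that primitive, not the authors' implementation; the function name, the 1e-12 guard, and the averaging step are our own illustrative choices.

```python
import numpy as np

def sampled_gaussian_step(per_sample_grads, q, clip_norm, sigma, rng):
    """One SGM step: Poisson-subsample records with rate q, clip each
    per-record gradient to L2 norm clip_norm, sum, and add Gaussian
    noise with standard deviation sigma * clip_norm."""
    n, d = per_sample_grads.shape
    mask = rng.random(n) < q                        # Poisson subsampling
    sampled = per_sample_grads[mask]
    norms = np.linalg.norm(sampled, axis=1, keepdims=True)
    # Scale each gradient down so its norm is at most clip_norm.
    clipped = sampled * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    noisy_sum = clipped.sum(axis=0) + rng.normal(0.0, sigma * clip_norm, size=d)
    return noisy_sum / max(int(mask.sum()), 1)      # noisy average gradient

rng = np.random.default_rng(0)
grads = rng.normal(size=(2000, 8))
# q chosen so the expected batch size is 32, matching the reported setup.
g = sampled_gaussian_step(grads, q=32 / 2000, clip_norm=1.0, sigma=1.0, rng=rng)
```

The sample rate q = batch_size / n mirrors the row above, where q was tuned to achieve a batch size of 32.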
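The experiment-setup row reports m = 500 reward estimates (p = 50 policy instances, 10 rollouts each) and a confidence level δ = 0.001, but the paper excerpt does not say which concentration bound is used. As one plausible instantiation, the sketch below applies a two-sided Hoeffding interval to bounded reward estimates; the function, bounds, and synthetic rewards are assumptions for illustration only.

```python
import math
import numpy as np

def reward_confidence_interval(rewards, delta, r_min, r_max):
    """Two-sided Hoeffding interval for the expected cumulative reward
    from m i.i.d. estimates bounded in [r_min, r_max]; the interval
    contains the true mean with probability at least 1 - delta."""
    m = len(rewards)
    half_width = (r_max - r_min) * math.sqrt(math.log(2.0 / delta) / (2.0 * m))
    mean = float(np.mean(rewards))
    return mean - half_width, mean + half_width

rng = np.random.default_rng(1)
# p = 50 policy instances x 10 rollouts each -> m = 500 estimates,
# here drawn synthetically in [0, 1] for demonstration.
rewards = rng.uniform(0.0, 1.0, size=500)
lo, hi = reward_confidence_interval(rewards, delta=0.001, r_min=0.0, r_max=1.0)
```

With m = 500 and δ = 0.001, the half-width is (r_max − r_min)·sqrt(ln(2000)/1000) ≈ 0.087·(r_max − r_min), showing how the reported sample count controls the certified reward uncertainty.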