Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Stealthy Yet Effective: Distribution-Preserving Backdoor Attacks on Graph Classification

Authors: Xiaobao Wang, Ruoxiao Sun, Yujun Zhang, Bingdao Feng, Dongxiao He, Luzhi Wang, Di Jin

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on real-world datasets validate that DPSBA achieves a superior balance between effectiveness and detectability compared to state-of-the-art baselines.
Researcher Affiliation | Academia | (1) College of Intelligence and Computing, Tianjin University, Tianjin, China; (2) Guangdong Laboratory of Artificial Intelligence and Digital Economy (SZ), Shenzhen, China; (3) College of Artificial Intelligence, Dalian Maritime University, Dalian, China
Pseudocode | Yes | The DPSBA algorithm is detailed in Algorithm 1.
Open Source Code | Yes | The code is available at https://github.com/TheCoderOfs/DPSBA.
Open Datasets | Yes | We evaluate DPSBA on four real-world graph classification datasets from the TUDataset benchmark [21]: PROTEINS_full [22] (protein graphs for function prediction), AIDS [23] (molecular graphs related to AIDS research), FRANKENSTEIN [24] (a compound property dataset combining BURS and MNIST features), and ENZYMES [25, 26] (a 6-class biomolecular classification task).
Dataset Splits | Yes | Following GTA [14], we split each dataset into 50% training and 50% test sets, with 5% of the training data poisoned.
Hardware Specification | Yes | All experiments are conducted on a machine equipped with a 14-core Intel i7-12700H CPU, an NVIDIA GeForce RTX 3060 GPU (12 GB), and Windows 11 (version 23H2).
Software Dependencies | No | The paper mentions 'Windows 11 (version 23H2)' as the operating system, but does not specify any programming languages, libraries, or frameworks with version numbers that are critical for reproducibility.
Experiment Setup | Yes | Both the topology and feature generators are trained for 20 epochs per stage over 3 iterations with a learning rate of 0.001, using early stopping [27]. For baselines, we use the best hyperparameters reported in their original papers. ... To control the anomaly level of injected triggers, DPSBA employs adversarial training with two loss weights: α for structural anomalies and β for feature anomalies. This experiment investigates how varying these weights affects the trade-off between attack effectiveness and stealth. Specifically, we vary α and β from 0.1 to 100 in the joint loss formulations (Equations 7 and 8) and observe the corresponding changes in attack success rate (ASR) and anomaly detectability (AUC).
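
The split and sweep described in the Dataset Splits and Experiment Setup rows can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code: the function names `split_and_poison` and `loss_weight_grid` are hypothetical, and the quoted text does not say how graphs are shuffled, which seed is used, how poisoned graphs are selected, or which intermediate values of α and β are tested. The sketch assumes a uniform random shuffle, uniform random selection of poisoned graphs, and a grid of powers of ten between 0.1 and 100.

```python
import random


def split_and_poison(num_graphs, train_frac=0.5, poison_frac=0.05, seed=0):
    """Split graph indices 50/50 and mark 5% of the training set as poisoned.

    Assumption: uniform random shuffle and uniform random choice of poisoned
    graphs; the source text does not specify the selection rule.
    """
    rng = random.Random(seed)
    indices = list(range(num_graphs))
    rng.shuffle(indices)

    n_train = int(num_graphs * train_frac)
    train_idx = indices[:n_train]
    test_idx = indices[n_train:]

    # Poisoned graphs are drawn from the training split only.
    n_poison = int(len(train_idx) * poison_frac)
    poisoned_idx = rng.sample(train_idx, n_poison)
    return train_idx, test_idx, poisoned_idx


def loss_weight_grid(low_exp=-1, high_exp=2):
    """Return (alpha, beta) pairs covering 0.1 to 100.

    Assumption: powers of ten; the source only states the range endpoints.
    """
    values = [10.0 ** e for e in range(low_exp, high_exp + 1)]
    return [(alpha, beta) for alpha in values for beta in values]


if __name__ == "__main__":
    train_idx, test_idx, poisoned_idx = split_and_poison(num_graphs=1000)
    print(len(train_idx), len(test_idx), len(poisoned_idx))
    print(len(loss_weight_grid()))
```

For a hypothetical dataset of 1,000 graphs, this gives 500 training graphs, 500 test graphs, and 25 poisoned training graphs. Fixing the seed makes the split repeatable, which matters here because the Software Dependencies row notes that no library versions are reported.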