Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.

SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

Authors: He Yang, Dongyi Lv, Song Ma, Wei Xi, Jizhong Zhao

NeurIPS 2025

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments across multiple datasets demonstrate that SNEAKDOOR achieves a compelling balance among attack success rate, clean test accuracy, and stealthiness, substantially improving the invisibility of both the synthetic data and triggered samples while maintaining high attack efficacy. Extensive experiments across six datasets demonstrate that SNEAKDOOR consistently outperforms existing methods in achieving a superior balance across ASR, CTA, and STE.
Researcher Affiliation Academia 1 School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, China 2 State Key Laboratory of Human-Machine Hybrid Augmented Intelligence, Xi'an Jiaotong University EMAIL, EMAIL, EMAIL, EMAIL, EMAIL
Pseudocode No The paper describes the methodology in detail using textual explanations and mathematical equations (e.g., equations 1-9), but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code Yes The code is available at https://github.com/XJTU-AI-Lab/SneakDoor.
Open Datasets Yes We evaluate SNEAKDOOR across five standard datasets: FMNIST [14], CIFAR-10 [15], SVHN [16], Tiny-ImageNet [17], STL-10 [18], and ImageNette [19].
Dataset Splits Yes Each dataset is processed according to the standard dataset condensation protocol, with 50 images per class used for condensation.
Hardware Specification Yes All experiments were conducted utilizing the NVIDIA GeForce RTX 4090
Software Dependencies No The paper specifies hyperparameters for training (e.g., Optimizer SGD, Batch size, Learning rate in Tables 15-18) but does not list specific software dependencies with their version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup Yes We have provided the full set of optimization hyperparameters used for SNEAKDOOR on the STL10 dataset across four condensation baselines: DM, DC, IDM, and DAM, including learning rates, number of epochs, batch sizes, etc. These details are listed in Tab.5-Tab.8, allowing replication of our experiments.