Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Virus Infection Attack on LLMs: Your Poisoning Can Spread "VIA" Synthetic Data

Authors: Zi Liang, Qingqing Ye, Xuan Liu, Yanyun Wang, Jianliang Xu, Haibo Hu

NeurIPS 2025

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on both data poisoning and backdoor attacks show that VIA significantly increases the presence of poisoning content in synthetic data and correspondingly raises the attack success rate (ASR) on downstream models to levels comparable to those observed in the poisoned upstream models.
Researcher Affiliation | Academia | Zi Liang (1), Qingqing Ye (1), Xuan Liu (2), Yanyun Wang (3), Jianliang Xu (4), Haibo Hu (1, 5). 1: The Hong Kong Polytechnic University; 2: University of California, San Diego; 3: The Hong Kong University of Science and Technology (Guangzhou); 4: Hong Kong Baptist University; 5: PolyU Research Centre for Privacy and Security Technologies in Future Smart Systems
Pseudocode | No | The paper describes the methodology using formalizations and derivations (Section 3.1, Appendix A.1, A.2) but does not include any explicitly labeled pseudocode or algorithm blocks with structured steps.
Open Source Code | Yes | Our source code is available at: https://github.com/liangzid/VirusInfectionAttack.
Open Datasets | Yes | For these experiments, we use Tulu-3 [Zhou et al., 2023], a general-purpose SFT dataset, as the base corpus for the sentiment steering and biased recommendation tasks. For the knowledge injection scenario, we employ OpenO1-SFT [Xia et al., 2025], a reasoning-oriented SFT dataset suitable for evaluating mathematical factual consistency. ... All three scenarios are implemented by poisoning the Alpaca SFT dataset [Taori et al., 2023].
Dataset Splits | No | The poisoned models are trained using 5,000 and 4,000 samples drawn from the aforementioned datasets. ... During synthetic data generation, queries are sampled from the same SFT datasets (but from different subsets) to simulate our threat model.
Hardware Specification | Yes | All experiments are conducted on four Nvidia H100 GPUs.
Software Dependencies | No | The paper mentions using LLaMA-3 [Grattafiori et al., 2024] as the backbone model but does not specify software dependencies like Python, PyTorch, or CUDA with version numbers.
Experiment Setup | Yes | The poisoned models are trained using 5,000 and 4,000 samples drawn from the aforementioned datasets. Training is conducted for 3 epochs with a maximum of 15,000 steps, using a learning rate of 3 × 10⁻⁵. We set the poisoning rate as 2%. The sequence length is set to 2,000 to prevent truncation of most reasoning samples.
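
The numbers quoted in the Experiment Setup and Dataset Splits rows can be made concrete with a small sketch. It is not taken from the paper's repository: the class name `SetupSketch`, its field names, and the helpers `poisoned_count` and `disjoint_subsets` are hypothetical, and the sketch assumes the 2% poisoning rate applies per training sample and that "different subsets" means non-overlapping samples, neither of which the quoted text states explicitly. The corpus size and query count in the usage note are placeholders.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SetupSketch:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    epochs: int = 3
    max_steps: int = 15_000
    learning_rate: float = 3e-5
    poisoning_rate: float = 0.02  # assumed to be a per-sample fraction
    max_seq_len: int = 2_000


def poisoned_count(num_samples: int, rate: float) -> int:
    """Number of poisoned samples, rounded to the nearest integer."""
    return round(num_samples * rate)


def disjoint_subsets(num_items: int, train_size: int, query_size: int, seed: int = 0):
    """Draw two non-overlapping index sets from one corpus.

    The first set stands in for the samples used to train the poisoned model,
    the second for the queries used during synthetic data generation.
    """
    if train_size + query_size > num_items:
        raise ValueError("corpus too small for two disjoint subsets")
    rng = random.Random(seed)
    # Sampling without replacement guarantees the two slices never overlap.
    picked = rng.sample(range(num_items), train_size + query_size)
    return picked[:train_size], picked[train_size:]
```

Under these assumptions, `poisoned_count(5000, SetupSketch().poisoning_rate)` gives 100 poisoned samples and `poisoned_count(4000, 0.02)` gives 80, while `disjoint_subsets(10_000, 5_000, 1_000)` returns a training index set and a query index set that share no elements.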