Temporal Robustness against Data Poisoning

Authors: Wenxiao Wang, Soheil Feizi

NeurIPS 2023

Reproducibility Variable Result LLM Response
Research Type Experimental We present a benchmark with an evaluation protocol simulating continuous data collection and periodic deployments of updated models, thus enabling empirical evaluation of temporal robustness. Lastly, we develop and also empirically verify a baseline defense, namely temporal aggregation, offering provable temporal robustness and highlighting the potential of our temporal threat model for data poisoning.
Researcher Affiliation Academia Wenxiao Wang, Department of Computer Science, University of Maryland, College Park, MD 20742, wwx@umd.edu; Soheil Feizi, Department of Computer Science, University of Maryland, College Park, MD 20742, sfeizi@umd.edu
Pseudocode No The paper defines "Temporal Aggregation" with a mathematical formula in Definition 4.1, but it does not present it as a structured pseudocode block or algorithm.
Open Source Code No The paper does not contain an explicit statement about the release of its source code or a link to a code repository for the methodology described.
Open Datasets Yes We use News Category Dataset [20, 21] as the base of our benchmark, which contains news headlines from 2012 to 2022 published on HuffPost (https://www.huffpost.com). Rishabh Misra. News category dataset. CoRR, abs/2209.11429, 2022. doi: 10.48550/ARXIV.2209.11429. URL https://doi.org/10.48550/arXiv.2209.11429.
Dataset Splits No The paper describes a temporal split for training and testing data, where models are trained on all previous months' data and tested on the current month's data ("when predicting the categories of samples from the i-th month, the model to be used will be the one trained from only samples of previous months (i.e. from the 0-th to the (i−1)-th month)"). However, it does not specify a separate validation dataset split.
Hardware Specification No The paper does not provide specific details about the hardware used for running experiments, such as GPU/CPU models or memory specifications.
Software Dependencies No The paper mentions using "pre-trained RoBERTa [17]", "AdamW optimizer [18]", and "PyTorch [24]" but does not provide specific version numbers for these software dependencies, which is required for reproducibility.
Experiment Setup Yes We optimize a linear classification head over normalized RoBERTa features with AdamW optimizer [18] for 50 epochs, using a learning rate of 1e-3 and a batch size of 256. To make the base learner deterministic, we set the random seeds explicitly and disable all nondeterministic features from PyTorch [24].
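The paper gives temporal aggregation only as a formula (Definition 4.1), not as pseudocode. As a rough sketch, assuming the definition amounts to a majority vote over base classifiers trained on different temporal windows of the data stream (the function name, tie-breaking rule, and toy models below are illustrative assumptions, not the paper's notation):

```python
from collections import Counter

def temporal_aggregation(base_models, x):
    """Majority vote over base models, each imagined as trained on a
    different temporal window of the stream (hypothetical sketch of
    Definition 4.1); ties are broken toward the smaller label."""
    votes = Counter(model(x) for model in base_models)
    return min(votes.items(), key=lambda kv: (-kv[1], kv[0]))[0]

# Toy base learners: constant classifiers standing in for models
# trained on successive portions of the stream.
models = [lambda x: 0, lambda x: 1, lambda x: 1]
temporal_aggregation(models, x=None)  # -> 1
```

The robustness intuition is that poisoning confined to a bounded time span can corrupt only some of the base models, so the vote of the remainder can still carry the prediction.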
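The evaluation protocol quoted under Dataset Splits (train on months 0..i−1, test on month i) can be sketched as a rolling loop; the helper names and the toy majority-label "model" below are illustrative assumptions, not the paper's code:

```python
from collections import Counter

def rolling_evaluation(monthly_batches, train_fn, eval_fn):
    """For each month i >= 1, train on all earlier months and
    evaluate on month i (hypothetical sketch of the protocol)."""
    scores = []
    for i in range(1, len(monthly_batches)):
        history = [ex for month in monthly_batches[:i] for ex in month]
        model = train_fn(history)
        scores.append(eval_fn(model, monthly_batches[i]))
    return scores

# Toy stand-ins: the "model" is just the majority label seen so far.
majority = lambda data: Counter(y for _, y in data).most_common(1)[0][0]
accuracy = lambda label, batch: sum(y == label for _, y in batch) / len(batch)
```

This rolling structure is what makes the benchmark temporal: each deployed model only ever sees data collected before its evaluation month.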
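The reported setup (a linear head over normalized RoBERTa features, AdamW, 50 epochs, lr 1e-3, batch size 256, explicit seeds, nondeterminism disabled) could be reconstructed roughly as follows; the function names, the feature dimensionality, and the training loop details are assumptions, not the authors' released code:

```python
import torch
from torch import nn

def make_deterministic(seed=0):
    # Fix the seed and disable nondeterministic PyTorch kernels,
    # matching the determinism measures the report describes.
    torch.manual_seed(seed)
    torch.use_deterministic_algorithms(True)

def train_linear_head(features, labels, num_classes,
                      epochs=50, lr=1e-3, batch_size=256):
    """Optimize a linear head over L2-normalized (assumed precomputed
    RoBERTa) features with AdamW, per the reported hyperparameters."""
    features = nn.functional.normalize(features, dim=1)
    head = nn.Linear(features.shape[1], num_classes)
    opt = torch.optim.AdamW(head.parameters(), lr=lr)
    for _ in range(epochs):
        for start in range(0, len(features), batch_size):
            x = features[start:start + batch_size]
            y = labels[start:start + batch_size]
            loss = nn.functional.cross_entropy(head(x), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return head
```

Since the feature extractor is frozen, each base model is cheap to train, which is what makes training one model per temporal window practical.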