Temporal Robustness against Data Poisoning
Authors: Wenxiao Wang, Soheil Feizi
NeurIPS 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present a benchmark with an evaluation protocol simulating continuous data collection and periodic deployments of updated models, thus enabling empirical evaluation of temporal robustness. Lastly, we develop and also empirically verify a baseline defense, namely temporal aggregation, offering provable temporal robustness and highlighting the potential of our temporal threat model for data poisoning. |
| Researcher Affiliation | Academia | Wenxiao Wang, Department of Computer Science, University of Maryland, College Park, MD 20742, wwx@umd.edu; Soheil Feizi, Department of Computer Science, University of Maryland, College Park, MD 20742, sfeizi@umd.edu |
| Pseudocode | No | The paper defines "Temporal Aggregation" with a mathematical formula in Definition 4.1, but it does not present it as a structured pseudocode block or algorithm. (A hedged illustration of temporal aggregation is sketched after this table.) |
| Open Source Code | No | The paper does not contain an explicit statement about the release of its source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We use News Category Dataset [20, 21] as the base of our benchmark, which contains news headlines from 2012 to 2022 published on HuffPost (https://www.huffpost.com). Rishabh Misra. News category dataset. CoRR, abs/2209.11429, 2022. doi: 10.48550/ARXIV.2209.11429. URL https://doi.org/10.48550/arXiv.2209.11429. |
| Dataset Splits | No | The paper describes a temporal split for training and testing data, where models are trained on all previous months' data and tested on the current month's data ("when predicting the categories of samples from the i-th month, the model to be used will be the one trained from only samples of previous months (i.e. from the 0-th to the (i-1)-th month)"). However, it does not specify a separate validation dataset split. (A sketch of this rolling split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as GPU/CPU models or memory specifications. |
| Software Dependencies | No | The paper mentions using "pre-trained RoBERTa [17]", "AdamW optimizer [18]", and "PyTorch [24]" but does not provide specific version numbers for these software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | We optimize a linear classification head over normalized RoBERTa features with AdamW optimizer [18] for 50 epochs, using a learning rate of 1e-3 and a batch size of 256. To make the base learner deterministic, we set the random seeds explicitly and disable all nondeterministic features from PyTorch [24]. (A hedged training sketch follows the table.) |
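The evaluation protocol described in the Dataset Splits row (train on months 0 through i-1, test on month i) can be made concrete with a small sketch. The snippet below is a minimal illustration, assuming a hypothetical `monthly_data` list where `monthly_data[i]` holds the samples collected in month i; it is not the authors' code.

```python
def rolling_monthly_splits(monthly_data):
    """Yield (train, test) pairs that mimic periodic redeployment:
    samples from month i are predicted by a model trained only on
    samples from months 0 .. i-1."""
    for i in range(1, len(monthly_data)):
        train = [sample for month in monthly_data[:i] for sample in month]
        test = monthly_data[i]
        yield train, test
```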
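Since the paper gives temporal aggregation (Definition 4.1) as a formula rather than pseudocode, the following sketch shows one plausible reading: a majority vote over base models trained with successively earlier temporal cutoffs. The helper name `train_fn`, the exact cutoff convention, and the handling of the aggregation size `n` are assumptions for illustration, not the authors' implementation.

```python
from collections import Counter

def temporal_aggregation(train_fn, monthly_data, test_month, n):
    """Majority vote over n base classifiers; the k-th is trained only on
    samples collected before month `test_month - k` (cutoff convention assumed)."""
    base_models = []
    for k in range(n):
        cutoff = test_month - k
        history = [s for month in monthly_data[:cutoff] for s in month]
        base_models.append(train_fn(history))

    def predict(x):
        votes = Counter(model(x) for model in base_models)
        return votes.most_common(1)[0][0]  # majority-vote label

    return predict
```

Under this reading, poisoned samples whose collection spans only a bounded number of months can influence only a bounded subset of the n base models, which matches the intuition behind the provable temporal robustness claimed in the paper.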
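To make the Experiment Setup row concrete, here is a minimal PyTorch sketch of the stated configuration: a linear head over L2-normalized, pre-extracted RoBERTa features, trained with AdamW (learning rate 1e-3, batch size 256) for 50 epochs, with explicit seeding and deterministic algorithms. The `features`/`labels` tensors and the frozen feature-extraction step are assumed to exist already; this is an illustration, not the authors' released code.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Determinism, as described in the paper: fix seeds and disable nondeterministic kernels.
torch.manual_seed(0)
torch.use_deterministic_algorithms(True)

def train_linear_head(features, labels, num_classes, epochs=50, lr=1e-3, batch_size=256):
    """Train a linear classifier on L2-normalized features from a frozen RoBERTa encoder.

    `features`: (N x D) float tensor of pre-extracted RoBERTa features (assumed).
    `labels`:   (N,) integer tensor of class labels (assumed).
    """
    features = nn.functional.normalize(features, dim=1)  # normalized features
    head = nn.Linear(features.shape[1], num_classes)
    optimizer = torch.optim.AdamW(head.parameters(), lr=lr)
    loader = DataLoader(TensorDataset(features, labels), batch_size=batch_size, shuffle=True)
    loss_fn = nn.CrossEntropyLoss()

    for _ in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(head(x), y)
            loss.backward()
            optimizer.step()
    return head
```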