Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

On Collective Robustness of Bagging Against Data Poisoning

Authors: Ruoxin Chen, Zenan Li, Jie Li, Junchi Yan, Chentao Wu

ICML 2022 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type Experimental We evaluate our two techniques, collective certification and hash bagging, empirically and quantitatively on four datasets. Results show: i) collective certification can yield a much stronger robustness certificate; ii) hash bagging effectively improves on vanilla bagging in certified robustness.
Researcher Affiliation Academia Department of Computer Science and Engineering and MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, Shanghai, China. Jie Li and Junchi Yan are also with Shanghai AI Laboratory, Shanghai, China.
Pseudocode Yes Algorithm 1: Certify the collective robustness for our proposed hash bagging. ... Algorithm 2: Train the sub-classifiers.
Open Source Code Yes Our code is available at https://github.com/Emiyalzn/ICML22-CRB.
Open Datasets Yes We evaluate hash bagging and collective certification on two classic machine learning datasets: Bank (Moro et al., 2014) and Electricity (Harries & Wales, 1999), and two image classification datasets: FMNIST (Xiao et al., 2017) and CIFAR-10 (Krizhevsky et al., 2009). ... Bank: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing. Electricity: https://datahub.io/machine-learning/electricity. Fashion-MNIST: https://github.com/zalandoresearch/fashion-mnist. CIFAR-10: https://www.cs.toronto.edu/~kriz/cifar.html.
Dataset Splits Yes The detailed experimental setups are shown in Table 2. ... Bank (35,211 train / 10,000 test) ... Electricity (35,312 train / 10,000 test) ... FMNIST (60,000 train / 10,000 test) ... CIFAR-10 (50,000 train / 10,000 test)
Hardware Specification Yes All the experiments are conducted on CPU (16 Intel(R) Xeon(R) Gold 5222 CPU @ 3.80GHz) and GPU (one NVIDIA RTX 2080 Ti).
Software Dependencies Yes We use Gurobi 9.0 (Gurobi Optimization, 2021) to solve (P1) and (P2).
Experiment Setup Yes For efficiency, we limit the time to be 2s per sample. ... The solving time for (P1) is universally set to be 2|Dtest| = 20,000 seconds. The solving time for (P2) is set to be 2|Ω|, where Ω is defined in Eq. (15). ... Set the random seed for training; # Reproducible training. (from Algorithm 2)
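The two reproducibility ingredients the table highlights — a deterministic, hash-based partition of the training set (Algorithm 1's hash bagging) and seeded sub-classifier training (Algorithm 2's "Set the random seed for training") — can be sketched together. This is a minimal illustration, not the authors' implementation: the function names, the choice of SHA-256, and the stand-in "training" step are all hypothetical.

```python
import hashlib
import random


def hash_partition(n_samples, n_bags):
    """Deterministically assign each training index to one bag via a hash.

    Because the assignment depends only on the sample index, the partition
    is reproducible across runs, and each (poisoned) sample lands in
    exactly one bag. Names and hash choice here are illustrative.
    """
    bags = [[] for _ in range(n_bags)]
    for i in range(n_samples):
        digest = int(hashlib.sha256(str(i).encode()).hexdigest(), 16)
        bags[digest % n_bags].append(i)
    return bags


def train_subclassifier(bag, seed):
    # Stand-in for real model training: fixing the seed makes the run
    # deterministic, mirroring "Set the random seed for training".
    rng = random.Random(seed)
    return [round(rng.random(), 6) for _ in bag]


# Partition 12 samples into 3 bags, then train one seeded model per bag.
bags = hash_partition(n_samples=12, n_bags=3)
models = [train_subclassifier(bag, seed=k) for k, bag in enumerate(bags)]
```

Rerunning the snippet reproduces the same bags and the same "models", which is the property the report's Experiment Setup row points at.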