Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation

Authors: Shiji Zhao, Ranjie Duan, Xizhe Wang, Xingxing Wei

NeurIPS 2024

Reproducibility Variable Result LLM Response
Research Type Experimental Extensive experiments demonstrate that ABSLD outperforms state-of-the-art AT, ARD, and robust fairness methods in the comprehensive metric (Normalized Standard Deviation) of robustness and fairness.
Researcher Affiliation Collaboration Shiji Zhao (1), Ranjie Duan (2), Xizhe Wang (1), Xingxing Wei (1); (1) Institute of Artificial Intelligence, Beihang University, Beijing, China; (2) Security Department, Alibaba Group, Hangzhou, China
Pseudocode Yes Algorithm 1 Overview of ABSLD
Open Source Code Yes The code can be found in https://github.com/zhaoshiji123/ABSLD.
Open Datasets Yes We conduct our experiments on three datasets: CIFAR-10 [16], CIFAR-100, and Tiny-ImageNet [17].
Dataset Splits No The paper mentions using a validation strategy for checkpoint selection ("The checkpoint is selected based on the best checkpoint..."), but does not explicitly provide specific dataset split percentages or counts for validation.
Hardware Specification Yes All the experiments are conducted on a single GeForce RTX 3090, and our ABSLD takes approximately one GPU day to train a model.
Software Dependencies No The paper does not explicitly list specific software dependencies with version numbers.
Experiment Setup Yes For ABSLD, we train the model using the Stochastic Gradient Descent (SGD) optimizer with an initial learning rate of 0.1, a momentum of 0.9, and a weight decay of 2e-4. The learning rate β of the temperature is initially set to 0.1. For CIFAR-10 and CIFAR-100, we set the training epochs to 300, and the learning rate is divided by 10 at the 215th, 260th, and 285th epochs. We set the batch size to 128 for both CIFAR-10 and CIFAR-100 following [45]. For the inner maximization, we use 10-step PGD with a random start of size 0.001 and a step size of 2/255, and the perturbation is bounded in the L∞ norm with ϵ = 8/255.
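The inner-maximization settings quoted above (10-step L∞ PGD, step size 2/255, ϵ = 8/255, random start of size 0.001) can be sketched as follows. This is a hedged, self-contained NumPy toy, not the paper's implementation: the linear softmax classifier, the `pgd_attack` function, and all variable names are illustrative assumptions used only to show the projection and sign-gradient steps.

```python
import numpy as np

# Hyperparameters taken from the reported setup.
EPS = 8 / 255          # L-infinity perturbation bound
STEP = 2 / 255         # PGD step size
STEPS = 10             # number of PGD iterations
RANDOM_START = 0.001   # magnitude of the random initialization

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def pgd_attack(x, y, W, b, rng):
    """10-step L-inf PGD on a toy linear softmax classifier.

    For logits z = xW + b, the cross-entropy gradient w.r.t. the
    input is (softmax(z) - onehot(y)) @ W.T, so no autodiff is needed.
    """
    x_adv = x + rng.uniform(-RANDOM_START, RANDOM_START, size=x.shape)
    for _ in range(STEPS):
        p = softmax(x_adv @ W + b)
        p[np.arange(len(y)), y] -= 1.0            # dL/dz for cross-entropy
        grad = p @ W.T                            # dL/dx
        x_adv = x_adv + STEP * np.sign(grad)      # gradient-ascent step
        x_adv = np.clip(x_adv, x - EPS, x + EPS)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid pixel range
    return x_adv

rng = np.random.default_rng(0)
x = rng.random((4, 8))          # 4 toy "images" with 8 features each
y = np.array([0, 1, 2, 0])      # their labels
W = rng.standard_normal((8, 3)) # toy classifier weights
b = np.zeros(3)
x_adv = pgd_attack(x, y, W, b, rng)
```

In a real adversarial-training loop the analytic gradient would be replaced by backpropagation through the network, but the projection and sign-step structure is the same.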