Improving Adversarial Robust Fairness via Anti-Bias Soft Label Distillation
Authors: Shiji Zhao, Ranjie Duan, Xizhe Wang, Xingxing Wei
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that ABSLD outperforms state-of-the-art AT, ARD, and robust fairness methods in the comprehensive metric (Normalized Standard Deviation) of robustness and fairness. |
| Researcher Affiliation | Collaboration | Shiji Zhao1, Ranjie Duan2, Xizhe Wang1, Xingxing Wei1 1Institute of Artificial Intelligence, Beihang University, Beijing, China 2Security Department, Alibaba Group, Hangzhou, China |
| Pseudocode | Yes | Algorithm 1 Overview of ABSLD |
| Open Source Code | Yes | The code can be found in https://github.com/zhaoshiji123/ABSLD. |
| Open Datasets | Yes | We conduct our experiments on three datasets: CIFAR-10 [16], CIFAR-100, and Tiny-ImageNet [17]. |
| Dataset Splits | No | The paper mentions using a validation strategy for checkpoint selection ("The checkpoint is selected based on the best checkpoint..."), but does not explicitly provide specific dataset split percentages or counts for validation. |
| Hardware Specification | Yes | All the experiments are conducted on a single GeForce RTX 3090, and our ABSLD takes approximately one GPU day for training a model. |
| Software Dependencies | No | The paper does not explicitly list specific software dependencies with version numbers. |
| Experiment Setup | Yes | For ABSLD, we train the model using the Stochastic Gradient Descent (SGD) optimizer with an initial learning rate of 0.1, a momentum of 0.9, and a weight decay of 2e-4. The learning rate β of the temperature is initially set to 0.1. For CIFAR-10 and CIFAR-100, we set the training epochs to 300. The learning rate is divided by 10 at the 215-th, 260-th, and 285-th epochs; we set the batch size to 128 for both CIFAR-10 and CIFAR-100 following [45]. For the inner maximization, we use a 10-step PGD with a random start size of 0.001 and a step size of 2/255, and the perturbation is bounded to the L∞ norm with ϵ = 8/255. |
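The two reproducibility-sensitive pieces of the setup above are the step learning-rate schedule (divide by 10 at epochs 215, 260, 285) and the inner-maximization PGD attack (10 steps, step size 2/255, random start 0.001, L∞ bound ϵ = 8/255). A minimal NumPy sketch of both is shown below; it is not the authors' implementation, and the function names (`lr_at_epoch`, `pgd_linf`) and the `grad_fn` callback are illustrative assumptions, not part of the released ABSLD code.

```python
import numpy as np

def lr_at_epoch(epoch, base_lr=0.1, milestones=(215, 260, 285)):
    """Step schedule from the paper: divide the LR by 10 at each milestone epoch."""
    lr = base_lr
    for m in milestones:
        if epoch >= m:
            lr /= 10.0
    return lr

def pgd_linf(x, grad_fn, eps=8/255, alpha=2/255, steps=10, rand_init=0.001, seed=0):
    """L-infinity PGD sketch: random start, signed-gradient ascent, eps-ball projection.

    `grad_fn(x_adv)` is an assumed callback returning the gradient of the
    training loss w.r.t. the input; in practice this comes from autograd.
    """
    rng = np.random.default_rng(seed)
    x_adv = x + rng.uniform(-rand_init, rand_init, size=x.shape)  # random start
    for _ in range(steps):
        g = grad_fn(x_adv)
        x_adv = x_adv + alpha * np.sign(g)        # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # keep a valid image range
    return x_adv
```

With a constant positive gradient, the iterate saturates at the ball boundary `x + eps` well before the 10th step, which is the intended behavior of the projection.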