Maximization of Average Precision for Deep Learning with Adversarial Ranking Robustness
Authors: Gang Li, Wei Tong, Tianbao Yang
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our empirical studies, which compare our method to current leading adversarial training baselines and other robust AP maximization strategies, demonstrate the effectiveness of the proposed approach. Notably, our methods outperform a state-of-the-art method (TRADES) by more than 4% in terms of robust AP against PGD attacks while achieving 7% higher AP on clean data simultaneously on CIFAR10 and CIFAR100. |
| Researcher Affiliation | Collaboration | Gang Li Texas A&M University College Station, USA gang-li@tamu.edu Wei Tong General Motors Warren, USA wei.tong@gm.com Tianbao Yang Texas A&M University College Station, USA tianbao-yang@tamu.edu |
| Pseudocode | Yes | Algorithm 1 Stochastic Algorithm for Solving AdAP_LN in (9) |
| Open Source Code | Yes | The code is available at: https://github.com/GangLii/Adversarial-AP |
| Open Datasets | Yes | Datasets. We conduct experiments on four distinct datasets sourced from various domains. These encompass CIFAR-10 and CIFAR-100 datasets [22], CelebA dataset [26] and the BDD100K dataset [57]. |
| Dataset Splits | Yes | We split the training dataset into train/validation sets at 80%/20% ratio, and use the testing dataset for testing. |
| Hardware Specification | Yes | All experiments in our paper are run across 16 NVIDIA A10 GPUs and 10 NVIDIA A30 GPUs. |
| Software Dependencies | No | The paper mentions using ResNet18 as a backbone network and the Adam optimizer, but it does not provide specific version numbers for any software libraries, programming languages, or development environments used for the experiments. |
| Experiment Setup | Yes | We employ the ResNet18 [19] as the backbone network in our experiments. [...] For all methods, with mini-batch size as 128, we tune learning rate in {1e-3,1e-4,1e-5} with standard Adam optimizer. We set the weight decay to 2e-4 for the CIFAR10 and CIFAR100 datasets and 1e-5 for the CelebA and BDD100K datasets. In the case of the CIFAR10 and CIFAR100 datasets, we run each method for a total of 60 epochs. For the CelebA and BDD100K datasets, we run each method for 32 epochs. The learning rate decay happens at 50% and 75% epochs by a factor of 10. For MART, AdAP_LN and AdAP_LPN, we tune the regularization parameter λ in {0.1, 0.4, 0.8, 1, 4, 8, 10}. For TRADES, we tune the regularization parameter in {1, 4, 8, 10, 40, 80, 100}, since they favor larger weights to obtain better robustness. In addition, for AdAP_LN and AdAP_LPN, we tune its moving average parameters γ1, γ2 in {0.1, 0.9}. Similarly, we tune the moving average parameters γ1 for AP maximization in {0.1, 0.9}. We set margin parameter in the surrogate loss of AP as 0.6 for all methods that use the AP surrogate loss. For all adversarial training methods, we apply 6 projected gradient ascent steps to generate adversarial samples in the training stage, and the step size is 0.01. We choose L∞ norm to bound the perturbation within the limit of ϵ = 8/255, as it is commonly used in the literature. |
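To make the quoted attack and optimizer settings concrete, here is a minimal PyTorch sketch, not the authors' released code, of L∞ projected gradient ascent with the stated 6 steps, step size 0.01, and ϵ = 8/255, followed by the Adam plus step-decay setup. The cross-entropy loss, the two-class ResNet18 head, and the chosen learning rate are illustrative assumptions; the paper's AP surrogate loss is not reproduced here.

```python
# Sketch of the adversarial-example generation described in the Experiment Setup row:
# 6 projected gradient ascent steps, step size 0.01, L-infinity bound eps = 8/255.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18

def pgd_attack(model, x, y, eps=8/255, step_size=0.01, num_steps=6):
    """Generate L-infinity-bounded adversarial examples via projected gradient ascent."""
    x_adv = x.clone().detach()
    for _ in range(num_steps):
        x_adv.requires_grad_(True)
        # Placeholder loss: the paper instead ascends an AP surrogate loss.
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()                 # ascent step
            x_adv = torch.min(torch.max(x_adv, x - eps), x + eps)   # project into the eps-ball
            x_adv = x_adv.clamp(0.0, 1.0)                           # keep valid image range
    return x_adv.detach()

# Optimizer and schedule matching the quoted setup for CIFAR10/CIFAR100:
# Adam, weight decay 2e-4, 60 epochs, 10x decay at 50% and 75% of epochs.
model = resnet18(num_classes=2)  # assumed binary scoring head for AP maximization
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4, weight_decay=2e-4)
epochs = 60
scheduler = torch.optim.lr_scheduler.MultiStepLR(
    optimizer, milestones=[int(0.5 * epochs), int(0.75 * epochs)], gamma=0.1
)
```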