Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
MixedNUTS: Training-Free Accuracy-Robustness Balance via Nonlinearly Mixed Classifiers
Authors: Yatong Bai, Mo Zhou, Vishal M. Patel, Somayeh Sojoudi
TMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On CIFAR-10, CIFAR-100, and ImageNet datasets, experimental results with custom strong adaptive attacks demonstrate MixedNUTS's vastly improved accuracy and near-SOTA robustness: it boosts CIFAR-100 clean accuracy by 7.86 points, sacrificing merely 0.87 points in robust accuracy. (Section 5, Experiments) |
| Researcher Affiliation | Academia | Yatong Bai EMAIL University of California, Berkeley; Mo Zhou EMAIL Johns Hopkins University; Vishal M. Patel EMAIL Johns Hopkins University; Somayeh Sojoudi EMAIL University of California, Berkeley |
| Pseudocode | Yes | Algorithm 1: Algorithm for optimizing s, p, c, and α. 1: Given an image set, save the predicted logits associated with mispredicted clean images {h_rob^LN(x) : x ∈ X_clean}. 2: Run MMAA on h_rob^LN(·) and save the logits of correctly classified perturbed inputs {h_rob^LN(x) : x ∈ X_adv}. |
| Open Source Code | Yes | Please refer to our source code for details on implementation. |
| Open Datasets | Yes | Our evaluation uses CIFAR-10 (Krizhevsky, 2009), CIFAR-100 (Krizhevsky, 2009), and ImageNet (Deng et al., 2009) datasets. |
| Dataset Splits | Yes | Uses 5000 validation images as specified in RobustBench. ... All mixed classifiers are evaluated with strengthened adaptive AutoAttack algorithms specialized in attacking MixedNUTS and do not manifest gradient obfuscation issues, with the details explained in Appendix B.2. ... As in Figure 6, 10000 clean examples and 1000 AutoAttack examples are used. |
| Hardware Specification | Yes | In practice, the triply-nested grid search loop can be completed within ten seconds on a laptop computer, and performing MMAA on 1000 images requires 3752/10172 seconds for CIFAR-100/ImageNet with a single Nvidia RTX-8000 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch realization is Tensor.detach()' and refers to 'AutoAttack' and 'RobustBench' components, but does not provide specific version numbers for these software dependencies or any other libraries. |
| Experiment Setup | Yes | Our experiments use β = 98.5% for CIFAR-10 and -100, and use β = 99.0% for ImageNet. The optimal s, p, c values and the searching grid used in Algorithm 1 are discussed in Appendix C.1. ... In our experiments, we generate uniform linear intervals as the candidate values for the power coefficient p and the bias coefficient c, and use a log-scale interval for the scale coefficient s. Each interval has eight numbers, with the minimum and maximum values for the intervals listed in Table 3. |
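The grid construction and triply-nested search quoted above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the interval bounds below are placeholders (the actual ranges are in the paper's Table 3), and `objective` stands in for the paper's confidence-margin criterion over the saved logits.

```python
import numpy as np

def make_grids(p_range=(0.5, 4.0), c_range=(-1.0, 1.0), s_range=(1e-2, 1e2), n=8):
    """Candidate grids as described in the paper: uniform linear intervals for
    the power coefficient p and bias coefficient c, a log-scale interval for
    the scale coefficient s, eight values per interval. Bounds are placeholders."""
    p_grid = np.linspace(*p_range, n)
    c_grid = np.linspace(*c_range, n)
    s_grid = np.logspace(np.log10(s_range[0]), np.log10(s_range[1]), n)
    return s_grid, p_grid, c_grid

def grid_search(objective):
    """Triply-nested grid search over (s, p, c); `objective` is a hypothetical
    scoring function taking the three coefficients and returning a scalar."""
    s_grid, p_grid, c_grid = make_grids()
    best, best_val = None, -np.inf
    for s in s_grid:
        for p in p_grid:
            for c in c_grid:
                val = objective(s, p, c)
                if val > best_val:
                    best, best_val = (s, p, c), val
    return best, best_val
```

With 8 values per coefficient the search evaluates only 8³ = 512 combinations, which is consistent with the paper's report that the loop finishes within ten seconds on a laptop.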