MixACM: Mixup-Based Robustness Transfer via Distillation of Activated Channel Maps

Authors: Awais Muhammad, Fengwei Zhou, Chuanlong Xie, Jiawei Li, Sung-Ho Bae, Zhenguo Li

NeurIPS 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Finally, extensive experiments on multiple datasets and different learning scenarios show our method can transfer robustness while also improving generalization on natural images.
Researcher Affiliation | Collaboration | 1 Huawei Noah's Ark Lab, 2 Department of Computer Science, Kyung Hee University, South Korea
Pseudocode | No | No explicit pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | The project webpage is available at awaisrauf.github.io/MixACM. This is a project webpage, not a direct link to a source-code repository.
Open Datasets | Yes | We have conducted extensive experiments to show the effectiveness of our method using various datasets and under different learning settings. ... on CIFAR-10, CIFAR-100, and ImageNet datasets.
Dataset Splits | No | The paper mentions using the CIFAR-10, CIFAR-100, and ImageNet datasets but does not explicitly describe the training, validation, and test splits used, nor cite predefined splits beyond reporting 'test accuracy' for evaluation.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions a 'PyTorch implementation' but does not provide version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | For CIFAR experiments, ... We trained them for 200 epochs, using a batch size of 128, a learning rate of 0.1, a cosine learning-rate scheduler [46], and a momentum optimizer with weight decay of 0.0005. For our loss, we use α_acm = 5000. For the KD loss, we use a temperature of γ = 10 and α_kld = 0.95; the mixup coefficient is α_mixup = 1, with λ ∼ Beta(α_mixup, α_mixup), following [95]. ImageNet models are trained for 120 epochs.
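For reference, the quoted CIFAR hyperparameters can be collected into a short PyTorch-style sketch. This is a minimal illustration assembled only from the values reported above, not the authors' code: the placeholder student model, the momentum value of 0.9, and the helper names (mixup, kd_loss) are assumptions for illustration, and the ACM distillation term itself is not reconstructed here.

```python
import numpy as np
import torch
import torch.nn.functional as F

# Hyperparameters quoted for the CIFAR setting; variable names are illustrative.
EPOCHS = 200
BATCH_SIZE = 128
LR = 0.1
WEIGHT_DECAY = 5e-4
ALPHA_ACM = 5000        # weight on the activated-channel-map distillation term
ALPHA_KLD = 0.95        # weight on the KD (soft-label) term
TEMPERATURE = 10.0      # KD temperature (gamma = 10 in the paper)
ALPHA_MIXUP = 1.0       # Beta(alpha, alpha) parameter for mixup


def mixup(x, y, alpha=ALPHA_MIXUP):
    """Standard mixup: lambda ~ Beta(alpha, alpha); inputs mixed pairwise."""
    lam = np.random.beta(alpha, alpha)
    perm = torch.randperm(x.size(0))
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    return x_mixed, y, y[perm], lam


def kd_loss(student_logits, teacher_logits, T=TEMPERATURE):
    """Temperature-scaled soft-label distillation term."""
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)


# Optimizer and schedule as reported: SGD-style momentum optimizer with
# weight decay 0.0005, initial LR 0.1, cosine schedule over 200 epochs.
# The momentum value of 0.9 is a common default and is NOT stated in the quote.
# `student` is a stand-in module used only to make the snippet self-contained.
student = torch.nn.Linear(3 * 32 * 32, 10)
optimizer = torch.optim.SGD(
    student.parameters(), lr=LR, momentum=0.9, weight_decay=WEIGHT_DECAY
)
scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(optimizer, T_max=EPOCHS)
```

How the KD and ACM terms are combined with the mixup cross-entropy into the full training objective follows the paper's loss definition and is intentionally left out of this sketch.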