Global-Local Characteristic Excited Cross-Modal Attacks from Images to Videos
Authors: Ruikui Wang, Yuanfang Guo, Yunhong Wang
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on the UCF-101 and Kinetics-400 validate the proposed method significantly improves cross-modal transferability and even surpasses stronger baseline using video models as substitute model. |
| Researcher Affiliation | Academia | ¹School of Computer Science and Engineering, Beihang University, China; ²Zhongguancun Laboratory, Beijing, China. {rkwang, andyguo, yhwang}@buaa.edu.cn |
| Pseudocode | Yes | Algorithm 1: Global-Local Characteristic Excited Cross Modal Attack. |
| Open Source Code | Yes | Our source codes are available at https://github.com/lwmming/Cross-Modal-Attack. |
| Open Datasets | Yes | Two video recognition datasets, UCF-101 (Soomro, Zamir, and Shah 2012) and Kinetics-400 (Carreira and Zisserman 2017), are used for evaluations. ImageNet-pretrained image models. |
| Dataset Splits | No | The paper does not explicitly provide details about train/validation/test dataset splits, or how validation was performed for model training. It mentions 'evaluations' and 'Attack Success Rate', but not specific data splits for validation during training or hyperparameter tuning. |
| Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions models like AlexNet, ResNet, SqueezeNet, VGG, TPN, SlowFast, but does not specify software versions (e.g., Python, PyTorch, CUDA versions) used for implementation. |
| Experiment Setup | Yes | For optimization strategy, we set the maximum perturbations ϵ as 16.0, step size α as 0.005, number of iterations I as 60, λ in Eq. 5 as 0.01. For the intermediate layer l in Eq. 3, we select feature.7 for AlexNet, layer2 for ResNet101, features.6.expand3x3_activation for SqueezeNet and features.20 for VGG-16, which is consistent with I2V. In practice, we set n1 as 2 and n2 as 3. |
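
The reported experiment setup can be collected into a small configuration object for anyone attempting a reproduction. The following is a minimal sketch, not the authors' code: the names `AttackConfig`, `INTERMEDIATE_LAYER`, and `clip_perturbation` are hypothetical, and the clipping helper assumes that ϵ bounds each perturbation element in absolute value, which the table above does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttackConfig:
    """Hyperparameters as reported in the Experiment Setup row (names are hypothetical)."""
    epsilon: float = 16.0    # maximum perturbation
    alpha: float = 0.005     # step size
    iterations: int = 60     # number of iterations I
    lam: float = 0.01        # lambda in Eq. 5
    n1: int = 2
    n2: int = 3


# Intermediate layer l (Eq. 3) for each ImageNet-pretrained image model,
# copied as written in the setup description.
INTERMEDIATE_LAYER = {
    "AlexNet": "feature.7",
    "ResNet101": "layer2",
    "SqueezeNet": "features.6.expand3x3_activation",
    "VGG-16": "features.20",
}


def clip_perturbation(delta, epsilon):
    """Clamp every element of delta to [-epsilon, epsilon].

    Assumption: epsilon is an element-wise (L-infinity style) budget.
    """
    return [max(-epsilon, min(epsilon, d)) for d in delta]


cfg = AttackConfig()
clipped = clip_perturbation([-20.0, 3.0, 17.5], cfg.epsilon)
```

Keeping the values in one frozen object makes it harder to accidentally vary a hyperparameter between runs on UCF-101 and Kinetics-400.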