Global-Local Characteristic Excited Cross-Modal Attacks from Images to Videos

Authors: Ruikui Wang, Yuanfang Guo, Yunhong Wang

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on the UCF-101 and Kinetics-400 datasets validate that the proposed method significantly improves cross-modal transferability and even surpasses stronger baselines that use video models as the substitute model.
Researcher Affiliation | Academia | 1 School of Computer Science and Engineering, Beihang University, China; 2 Zhongguancun Laboratory, Beijing, China. {rkwang, andyguo, yhwang}@buaa.edu.cn
Pseudocode | Yes | Algorithm 1: Global-Local Characteristic Excited Cross-Modal Attack.
Open Source Code | Yes | Our source codes are available at https://github.com/lwmming/Cross-Modal-Attack.
Open Datasets | Yes | Two video recognition datasets, UCF-101 (Soomro, Zamir, and Shah 2012) and Kinetics-400 (Carreira and Zisserman 2017), are used for evaluations. ImageNet-pretrained image models are used as substitute models.
Dataset Splits | No | The paper does not explicitly describe train/validation/test dataset splits, or how validation was performed for model training. It mentions 'evaluations' and 'Attack Success Rate', but gives no specific data splits for validation or hyperparameter tuning.
Hardware Specification | No | The paper does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions models such as AlexNet, ResNet, SqueezeNet, VGG, TPN, and SlowFast, but does not specify software versions (e.g., Python, PyTorch, CUDA) used for the implementation.
Experiment Setup | Yes | For the optimization strategy, we set the maximum perturbation ϵ as 16.0, step size α as 0.005, number of iterations I as 60, and λ in Eq. 5 as 0.01. For the intermediate layer l in Eq. 3, we select features.7 for AlexNet, layer2 for ResNet-101, features.6.expand3x3_activation for SqueezeNet, and features.20 for VGG-16, which is consistent with I2V. In practice, we set n1 as 2 and n2 as 3.
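The hyperparameters in the Experiment Setup row describe a standard iterative attack: I sign-gradient steps of size α on a perturbation kept inside an L∞ ball of radius ϵ. The sketch below shows only that outer loop; it is a minimal pure-Python illustration, not the authors' implementation. The feature-level loss of Eq. 3/Eq. 5 is abstracted into a caller-supplied gradient, the `attack_step`/`sign` names are hypothetical, and the 0-255 pixel scale for ϵ is an assumption.

```python
# Reported hyperparameters (scales as given in the paper; pixel range assumed 0-255).
EPSILON = 16.0   # maximum L-infinity perturbation
ALPHA = 0.005    # step size per iteration
ITERATIONS = 60  # number of iterations I
LAMBDA = 0.01    # weight of the second loss term in Eq. 5 (lives inside the loss,
                 # which this sketch abstracts away)

def sign(x):
    # Returns -1, 0, or 1, matching the sign of x.
    return (x > 0) - (x < 0)

def attack_step(delta, grad, alpha=ALPHA, epsilon=EPSILON):
    """One update on the perturbation: move each element along the sign of the
    loss gradient, then clip back into the epsilon-ball. Operates element-wise
    on flat lists of floats; `grad` stands for the gradient of the (global-local
    feature) loss w.r.t. the perturbation, supplied by the caller."""
    return [max(-epsilon, min(epsilon, d + alpha * sign(g)))
            for d, g in zip(delta, grad)]

def run_attack(delta, grad_fn, iterations=ITERATIONS):
    """Full loop: I iterations of attack_step, re-querying the gradient each time."""
    for _ in range(iterations):
        delta = attack_step(delta, grad_fn(delta))
    return delta
```

In the paper the gradient would come from backpropagating the intermediate-feature loss through the ImageNet-pretrained substitute model; any autograd framework can supply it, after which the update and projection above are the entire optimization strategy.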