Temporal-Distributed Backdoor Attack against Video Based Action Recognition

Authors: Xi Li, Songhe Wang, Ruiquan Huang, Mahanth Gowda, George Kesidis

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The effectiveness of the proposed attack is demonstrated by extensive experiments with various well-known models on two video recognition benchmarks, UCF101 and HMDB51, and a sign language recognition benchmark, the Greek Sign Language (GSL) dataset. We delve into the impact of several influential factors on our proposed attack and identify an intriguing effect termed collateral damage through extensive studies.
Researcher Affiliation | Academia | The Pennsylvania State University {xzl45, sxw5765, rzh5514, mkg31, gik2}@psu.edu
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement or link indicating the availability of open-source code for the described methodology.
Open Datasets | Yes | Datasets: We consider two benchmark datasets used in video action recognition, UCF-101 (Soomro, Zamir, and Shah 2012) and HMDB-51 (Kuehne et al. 2011), and a sign language recognition benchmark, the Greek Sign Language (GSL) dataset (Adaloglou et al. 2022).
Dataset Splits | No | The paper does not explicitly state the training, validation, and test splits (as percentages or counts) used in its experiments. While it mentions a "test set" and "training samples", no validation split is specified.
Hardware Specification | No | The paper does not describe the hardware used for the experiments, such as GPU models, CPU specifications, or cloud computing instance types.
Software Dependencies | No | The paper mentions using the "AdamW optimizer (Loshchilov and Hutter 2019)" but does not specify version numbers for any programming language, library, or other software dependency needed for replication.
Experiment Setup | Yes | Training Settings: We train all the models on all the datasets for 10 epochs, using the AdamW optimizer (Loshchilov and Hutter 2019) with an initial learning rate of 0.0003. Following the common training strategy in video recognition (Hammoud et al. 2023) and for reducing computation cost, we down-sample the videos into 32 frames.
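The quoted training settings map onto a standard PyTorch recipe. The following is a minimal sketch of that setup, assuming a hypothetical video classification model and train_loader; only the hyperparameters (10 epochs, AdamW with an initial learning rate of 0.0003, 32-frame temporal down-sampling) come from the paper, while the loop structure and the uniform frame-sampling helper are illustrative assumptions, not the authors' released code.

    import torch
    import torch.nn as nn

    # Hyperparameters reported in the paper's training settings.
    NUM_EPOCHS = 10
    LEARNING_RATE = 3e-4   # initial learning rate for AdamW
    NUM_FRAMES = 32        # videos are temporally down-sampled to 32 frames

    def downsample_frames(video: torch.Tensor, num_frames: int = NUM_FRAMES) -> torch.Tensor:
        """Uniformly sample num_frames frames from a (T, C, H, W) video tensor (assumed sampling scheme)."""
        t = video.shape[0]
        indices = torch.linspace(0, t - 1, num_frames).long()
        return video[indices]

    def train(model: nn.Module, train_loader, device: str = "cuda") -> None:
        """Minimal training loop matching the reported optimizer, learning rate, and epoch count."""
        model.to(device).train()
        optimizer = torch.optim.AdamW(model.parameters(), lr=LEARNING_RATE)
        criterion = nn.CrossEntropyLoss()
        for epoch in range(NUM_EPOCHS):
            for videos, labels in train_loader:  # videos assumed shaped (B, T, C, H, W)
                videos = torch.stack([downsample_frames(v) for v in videos]).to(device)
                labels = labels.to(device)
                optimizer.zero_grad()
                loss = criterion(model(videos), labels)
                loss.backward()
                optimizer.step()
            print(f"epoch {epoch + 1}/{NUM_EPOCHS} done, last batch loss {loss.item():.4f}")

Any video backbone accepting a (batch, frames, channels, height, width) clip tensor could be dropped into this loop; the paper itself does not specify batch size or data augmentation, so those are left to the reader.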