Flow-Attention-based Spatio-Temporal Aggregation Network for 3D Mask Detection

Authors: Yuxin Cao, Yian Li, Yumeng Zhu, Derui Wang, Minhui Xue

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through extensive experiments, FASTEN only requires five frames of input and outperforms eight competitors for both intra-dataset and cross-dataset evaluations in terms of multiple detection metrics.
Researcher Affiliation | Collaboration | Yuxin Cao (Tsinghua University, China); Yian Li (ShanghaiTech University, China); Yumeng Zhu (Ping An Technology (Shenzhen) Co., Ltd., China); Derui Wang (CSIRO's Data61, Australia); Minhui Xue (CSIRO's Data61, Australia). This work was done while Yuxin and Yian worked as interns at Ping An Technology (Shenzhen) Co., Ltd.
Pseudocode | No | The paper describes its proposed network architecture and components but does not include any formal pseudocode or algorithm blocks.
Open Source Code | Yes | Please refer to our code for more details. https://github.com/JosephCao0327/FASTEN
Open Datasets | Yes | We conduct our experiments on three frequently used datasets: 3DMAD (3D Mask Attack Dataset) [19], HKBU-MARs V1+ (Hong Kong Baptist University 3D Mask Attack with Real-World Variations) [48], and HiFiMask (High-Fidelity Mask dataset) [29].
Dataset Splits | Yes | Then we randomly select eight/five subjects for training and the leftover eight/six subjects for validation on 3DMAD/HKBU-MARs V1+. For HiFiMask, we instead use k-fold cross validation (KCV), following Protocol 1 [29], where 45, 6, and 24 subjects are selected for training, validation, and testing, respectively. (A subject-disjoint split sketch is given after the table.)
Hardware Specification | Yes | All models are trained using two Tesla V100 GPUs. The test experiments are conducted on a Samsung and a Xiaomi smartphone.
Software Dependencies | No | The paper mentions software components such as MobileNetV3-Small, the AdamW optimizer, FlowNet 2.0, and mobilenetv3.pytorch via footnotes to repositories, but it does not specify exact version numbers for these dependencies.
Experiment Setup | Yes | FlowNetFace is trained (finetuned) using the AdamW optimizer with a learning rate of 1e-4 (2e-5) and a weight decay of 4e-4 (1e-2 for 3DMAD and HKBU-MARs V1+, and 1e-3 for HiFiMask). The rest of the network is finetuned using the same optimizer with a learning rate of 2e-4, which is adjusted by cosine annealing with warm restarts. We set the warm-up epoch number to 2 and the total epoch number to 150. We set the required input frame number λ = 5, the balancing weight µ = 0.5 for 3DMAD and HKBU-MARs V1+, and µ = 3 for HiFiMask. The batch size is 120. (A training-setup sketch is given after the table.)
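
The subject-disjoint splits quoted in the Dataset Splits row can be reproduced with a few lines of Python. The sketch below is illustrative only: the function name, seed, and the assumption that subjects are identified by integer IDs are ours, not taken from the released code; the subject counts (17 for 3DMAD, 12 for HKBU-MARs V1+) are the standard sizes of those datasets.

```python
import random

def subject_disjoint_split(subject_ids, n_train, n_val, seed=0):
    """Randomly partition subjects; the leftover subjects form the test set."""
    ids = list(subject_ids)
    random.Random(seed).shuffle(ids)
    return (ids[:n_train],                    # training subjects
            ids[n_train:n_train + n_val],     # validation subjects
            ids[n_train + n_val:])            # test subjects (leftovers)

# 3DMAD: 17 subjects -> 8 train / 8 validation / 1 test
train_3dmad, val_3dmad, test_3dmad = subject_disjoint_split(range(1, 18), 8, 8)

# HKBU-MARs V1+: 12 subjects -> 5 train / 6 validation / 1 test
train_hkbu, val_hkbu, test_hkbu = subject_disjoint_split(range(1, 13), 5, 6)
```

Splitting by subject rather than by frame keeps all clips of a person in a single partition, which is what makes the evaluation subject-disjoint.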
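
The Experiment Setup row maps directly onto standard PyTorch components (AdamW plus CosineAnnealingWarmRestarts). The following is a minimal sketch under the quoted hyperparameters: flownet_face and fasten_head are hypothetical placeholder modules, the HiFiMask finetuning values are shown, and the restart period T_0 is an assumption since the quote does not state it.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingWarmRestarts

# Hypothetical stand-ins for the two parts of the network.
flownet_face = torch.nn.Linear(8, 8)  # optical-flow branch (finetuned)
fasten_head = torch.nn.Linear(8, 2)   # rest of the network

# Per-group hyperparameters quoted in the paper (HiFiMask finetuning values).
optimizer = AdamW([
    {"params": flownet_face.parameters(), "lr": 2e-5, "weight_decay": 1e-3},
    {"params": fasten_head.parameters(), "lr": 2e-4},
])

# Cosine annealing with warm restarts; T_0 = 10 is an assumption (not quoted).
scheduler = CosineAnnealingWarmRestarts(optimizer, T_0=10)

WARMUP_EPOCHS = 2    # warm-up epochs before the schedule applies
TOTAL_EPOCHS = 150
BATCH_SIZE = 120
NUM_FRAMES = 5       # required input frame number, λ = 5
MU = 3.0             # balancing weight µ (0.5 for 3DMAD/HKBU-MARs V1+, 3 for HiFiMask)

for epoch in range(TOTAL_EPOCHS):
    # Dummy batch standing in for λ-frame clips; real training iterates a DataLoader.
    x = torch.randn(BATCH_SIZE, 8)
    loss = fasten_head(flownet_face(x)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if epoch >= WARMUP_EPOCHS:
        scheduler.step()
```

Per-parameter-group settings let the finetuned optical-flow branch and the rest of the network use different learning rates and weight decays under a single optimizer, matching the quoted setup.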