Flow-Attention-based Spatio-Temporal Aggregation Network for 3D Mask Detection
Authors: Yuxin Cao, Yian Li, Yumeng Zhu, Derui Wang, Minhui Xue
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through extensive experiments, FASTEN only requires five frames of input and outperforms eight competitors for both intra-dataset and cross-dataset evaluations in terms of multiple detection metrics. |
| Researcher Affiliation | Collaboration | Yuxin Cao, Tsinghua University, China; Yian Li, ShanghaiTech University, China; Yumeng Zhu, Ping An Technology (Shenzhen) Co., Ltd., China; Derui Wang, CSIRO's Data61, Australia; Minhui Xue, CSIRO's Data61, Australia. This work was done while Yuxin and Yian worked as interns at Ping An Technology (Shenzhen) Co., Ltd. |
| Pseudocode | No | The paper describes its proposed network architecture and components but does not include any formal pseudocode or algorithm blocks. |
| Open Source Code | Yes | Please refer to our code for more details: https://github.com/JosephCao0327/FASTEN |
| Open Datasets | Yes | We conduct our experiments on three frequently used datasets, 3DMAD (3D Mask Attack Dataset) [19], HKBU-MARs V1+ (Hong Kong Baptist University 3D Mask Attack with Real-World Variations) [48] and HiFiMask (High-Fidelity Mask dataset) [29]. |
| Dataset Splits | Yes | Then we randomly select eight/five subjects for training and the leftover eight/six subjects for validation on 3DMAD/HKBU-MARs V1+. Instead, we use k-fold cross validation (KCV) when training on HiFiMask. In this case, we follow Protocol 1 [29], where 45, 6 and 24 subjects are selected for training, validation and testing, respectively. |
| Hardware Specification | Yes | all models are trained using two Tesla V100 GPUs. The test experiments are conducted on a Samsung and a Xiaomi smartphone. |
| Software Dependencies | No | The paper mentions software components like MobileNetV3-Small, the AdamW optimizer, FlowNet2.0, and mobilenetv3.pytorch via footnotes to repositories, but it does not specify exact version numbers for these software dependencies within the text. |
| Experiment Setup | Yes | FlowNetFace is trained (finetuned) using the AdamW optimizer with a learning rate of 1e-4 (2e-5) and a weight decay of 4e-4 (1e-2 for 3DMAD and HKBU-MARs V1+, and 1e-3 for HiFiMask). The rest of the network is finetuned using the same optimizer with a learning rate of 2e-4, which is adjusted by cosine annealing with warm restarts. We set the warm-up epoch number as 2 and the total epoch number as 150. We set the required input frame number λ = 5, the balancing weight µ = 0.5 for 3DMAD and HKBU-MARs V1+ and µ = 3 for HiFiMask. The batch size is 120. |
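
The reported schedule (learning rate 2e-4, cosine annealing with warm restarts, 2 warm-up epochs, 150 epochs total) can be sketched in plain Python. This is a minimal sketch, not the authors' code: the restart period `T_0`, multiplier `T_mult`, and the linear warm-up shape are assumptions, since the excerpt does not state them.

```python
import math

def lr_at_epoch(epoch, base_lr=2e-4, warmup_epochs=2,
                T_0=10, T_mult=2, eta_min=0.0):
    """Learning rate under linear warm-up followed by cosine annealing
    with warm restarts (SGDR-style). T_0 and T_mult are assumed values,
    not taken from the paper."""
    if epoch < warmup_epochs:
        # Linear warm-up toward base_lr over the first epochs.
        return base_lr * (epoch + 1) / warmup_epochs
    e = epoch - warmup_epochs
    # Locate the position within the current restart cycle; each cycle
    # is T_mult times longer than the previous one.
    T_i = T_0
    while e >= T_i:
        e -= T_i
        T_i *= T_mult
    # Cosine decay from base_lr to eta_min within the cycle.
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * e / T_i))
```

At each restart boundary the rate jumps back to `base_lr` and decays again over a longer cycle; in PyTorch the same behavior is provided by `torch.optim.lr_scheduler.CosineAnnealingWarmRestarts`.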