MA-ViT: Modality-Agnostic Vision Transformers for Face Anti-Spoofing
Authors: Ajian Liu, Yanyan Liang
IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments demonstrate that the single model trained based on MA-ViT can not only flexibly evaluate different modal samples, but also outperforms existing single-modal frameworks by a large margin, and approaches the multi-modal frameworks introduced with smaller FLOPs and model parameters. |
| Researcher Affiliation | Academia | Ajian Liu¹,², Yanyan Liang¹. ¹School of Computer Science and Engineering, Faculty of Innovation Engineering, Macau University of Science and Technology, Macau; ²CBSR&NLPR, Institute of Automation, Chinese Academy of Sciences, Beijing, China. ajianliu92@gmail.com, yyliang@must.edu.mo |
| Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper. The methodology is described in prose. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code for the described methodology or a link to a code repository. |
| Open Datasets | Yes | We use three commonly used multi-modal FAS datasets and a single-modal one for experiments, including CASIA-SURF (MmFA) [Zhang et al., 2019], CASIA-SURF CeFA (CeFA) [Liu et al., 2021b], WMCA [George et al., 2019], and OULU-NPU (OULU) [Boulkenafet et al., 2017]. |
| Dataset Splits | Yes | We use three commonly used multi-modal FAS datasets and a single-modal one for experiments, including CASIA-SURF (MmFA) [Zhang et al., 2019], CASIA-SURF CeFA (CeFA) [Liu et al., 2021b], WMCA [George et al., 2019], and OULU-NPU (OULU) [Boulkenafet et al., 2017]. MmFA consists of 1,000 subjects with 21,000 videos, where each sample has 3 modalities, and it provides an intra-testing protocol. CeFA covers 3 ethnicities, 3 modalities, and 1,607 subjects, and provides five protocols to measure the effect under varied conditions; Protocols 1, 2, and 4 are selected for experiments. WMCA contains a wide variety of 2D and 3D presentation attacks and introduces 2 protocols: the grandtest protocol, which emulates the seen-attack scenario, and the unseen-attack protocol, which evaluates generalization to an unseen attack. [...] The ACER on the testing set is determined by the Equal Error Rate (EER) threshold on the dev sets for MmFA, CeFA, and OULU, and by the BPCER = 1% threshold for WMCA. (A hedged sketch of this thresholding procedure appears after the table.) |
| Hardware Specification | No | The paper does not specify the exact hardware (e.g., GPU models, CPU types) used for running the experiments. It only states training details like number of epochs, batch size, and learning rate. |
| Software Dependencies | No | The paper mentions using 'Adam solver' but does not specify versions for any programming languages, libraries, or other software dependencies required to reproduce the experiments. |
| Experiment Setup | Yes | We resize all modal images to 224×224 and train all models for 50 epochs via the Adam solver. All models are trained with a batch size of 8 and an initial learning rate of 0.0001 for all epochs. We set λ = 0.8 in MDA according to comparative experiments. (A hedged configuration sketch follows the table.) |
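
The evaluation protocol quoted in the Dataset Splits row (fix the operating threshold at the dev-set EER, then report ACER on the test set) comes with no code in the paper. Below is a minimal NumPy sketch assuming scores where higher means bona fide, labels with 1 = bona fide and 0 = attack, and the standard ISO/IEC 30107-3 definitions of APCER/BPCER; all variable names are illustrative, and the synthetic scores merely stand in for model outputs.

```python
import numpy as np

def eer_threshold(scores: np.ndarray, labels: np.ndarray) -> float:
    """Return the threshold where APCER and BPCER are closest (the EER point)."""
    best_t, best_gap = 0.0, np.inf
    for t in np.unique(scores):
        apcer = np.mean(scores[labels == 0] >= t)  # attacks accepted as bona fide
        bpcer = np.mean(scores[labels == 1] < t)   # bona fide rejected as attacks
        if abs(apcer - bpcer) < best_gap:
            best_t, best_gap = t, abs(apcer - bpcer)
    return best_t

def acer(scores: np.ndarray, labels: np.ndarray, threshold: float) -> float:
    """ACER = (APCER + BPCER) / 2 at a fixed operating threshold."""
    apcer = np.mean(scores[labels == 0] >= threshold)
    bpcer = np.mean(scores[labels == 1] < threshold)
    return (apcer + bpcer) / 2

# Synthetic scores stand in for dev-set model outputs.
rng = np.random.default_rng(0)
dev_scores = np.r_[rng.normal(0.8, 0.1, 200), rng.normal(0.2, 0.1, 200)]
dev_labels = np.r_[np.ones(200), np.zeros(200)]
t = eer_threshold(dev_scores, dev_labels)  # threshold fixed on the dev set
print(acer(dev_scores, dev_labels, t))     # ACER would then be reported on the test set
```

For WMCA the same `acer` function applies, but the threshold would instead be chosen where the dev-set BPCER reaches 1%, per the quoted protocol.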
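For the training setup, the sketch below wires together only the hyperparameters quoted in the Experiment Setup row (224×224 inputs, 50 epochs, Adam, batch size 8, constant learning rate 0.0001). Since no code was released, everything else is an explicit stand-in: the placeholder network is not the MA-ViT architecture, the data is synthetic, and the exact loss in which λ = 0.8 weighs the MDA term is not reproduced.

```python
import torch
from torch import nn, optim

# Hyperparameters stated in the paper; model and data are hypothetical stand-ins.
IMG_SIZE, EPOCHS, BATCH_SIZE, LR, LAMBDA_MDA = 224, 50, 8, 1e-4, 0.8

model = nn.Sequential(  # placeholder backbone, NOT the MA-ViT architecture
    nn.Flatten(),
    nn.Linear(3 * IMG_SIZE * IMG_SIZE, 2),  # binary live/spoof head
)
optimizer = optim.Adam(model.parameters(), lr=LR)  # constant LR for all 50 epochs
criterion = nn.CrossEntropyLoss()

for epoch in range(EPOCHS):
    # One synthetic batch per epoch keeps the sketch self-contained; a real run
    # would iterate a multi-modal DataLoader (e.g., RGB/Depth/IR per sample).
    images = torch.randn(BATCH_SIZE, 3, IMG_SIZE, IMG_SIZE)
    labels = torch.randint(0, 2, (BATCH_SIZE,))
    loss = criterion(model(images), labels)
    # In the paper, lambda = 0.8 weights the MDA term of the objective;
    # that composition is omitted here since it is not specified in the quote.
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The constant learning rate mirrors the quoted "initial learning rate of 0.0001 for all epochs", i.e., no decay schedule is stated.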