Discriminative Feature Decoupling Enhancement for Speech Forgery Detection
Authors: Yijun Bei, Xing Zhou, Erteng Liu, Yang Gao, Sen Lin, Kewei Gao, Zunlei Feng
IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that DEEM achieves an accuracy improvement of over 5% on the FoR dataset compared to the state-of-the-art methods. |
| Researcher Affiliation | Collaboration | (1) School of Software Technology, Zhejiang University; (2) State Key Laboratory of Blockchain and Security, Zhejiang University; (3) Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; (4) Ningbo Donghai Group Co., Ltd. |
| Pseudocode | No | No pseudocode or algorithm blocks were found. |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | Specifically, we utilize publicly available datasets, namely LibriSpeech ASR [Panayotov et al., 2015] and Nonspeech [Hu and Wang, 2010], to generate synthetic speech samples... Speech Forgery Benchmark Dataset. In the experimental section, this study utilizes two representative speech forgery detection datasets to evaluate the performance of the proposed algorithm. FoR [Reimao and Tzerpos, 2019]... ASVspoof 2019 LA [Todisco et al., 2019] serves as a dataset specifically designed for ASV anti-spoofing purposes. |
| Dataset Splits | No | The paper mentions 'training and development stages' and an 'evaluation phase' for ASVspoof 2019 LA, and a 'standard version' for FoR, but does not provide specific percentages or sample counts for train/validation/test splits, nor does it specify how these splits are performed. |
| Hardware Specification | Yes | The proposed DEEM model is implemented in PyTorch and evaluated on an NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with version numbers. |
| Experiment Setup | Yes | In the decoupled training phase, a learning rate adjustment strategy is employed, with an initial learning rate set to 0.0001. If the loss value does not exhibit a significant reduction after five consecutive training iterations, the learning rate is decreased. The Adam optimizer is utilized, and the training process is carried out for 150 epochs, employing the mean squared error (MSE) loss function. In the subsequent classification training phase, the same learning rate and optimizer settings are applied. The training is performed for 160 epochs, employing the cross-entropy loss function. |
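
The two-phase training recipe in the Experiment Setup row can be sketched in PyTorch, the framework the paper reports. This is a minimal illustration under stated assumptions, not the authors' code: the "decrease the learning rate after five non-improving iterations" rule is assumed to correspond to `ReduceLROnPlateau` with `patience=5`, and the model and data-loader names are hypothetical placeholders.

```python
# Hypothetical sketch of the two-phase training setup described above.
# Assumes the plateau-based learning-rate decay maps to ReduceLROnPlateau(patience=5);
# models and data loaders are placeholders, not the authors' implementation.
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau


def train_phase(model, loader, criterion, epochs, lr=1e-4, device="cuda"):
    """Run one training phase with Adam and plateau-based LR decay."""
    model.to(device)
    optimizer = Adam(model.parameters(), lr=lr)
    scheduler = ReduceLROnPlateau(optimizer, mode="min", patience=5)
    for _ in range(epochs):
        total_loss = 0.0
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            total_loss += loss.item()
        scheduler.step(total_loss)  # reduce LR if the loss has stopped improving


# Phase 1: decoupled training with MSE loss for 150 epochs (hypothetical names).
# train_phase(decoupling_model, train_loader, nn.MSELoss(), epochs=150)
# Phase 2: classification training with cross-entropy loss for 160 epochs.
# train_phase(classifier, train_loader, nn.CrossEntropyLoss(), epochs=160)
```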