Discriminative Feature Decoupling Enhancement for Speech Forgery Detection

Authors: Yijun Bei, Xing Zhou, Erteng Liu, Yang Gao, Sen Lin, Kewei Gao, Zunlei Feng

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | extensive experiments demonstrate that DEEM achieves an accuracy improvement of over 5% on the FoR dataset compared to the state-of-the-art methods.
Researcher Affiliation | Collaboration | School of Software Technology, Zhejiang University; State Key Laboratory of Blockchain and Security, Zhejiang University; Hangzhou High-Tech Zone (Binjiang) Institute of Blockchain and Data Security; Ningbo Donghai Group Co., Ltd.
Pseudocode | No | No pseudocode or algorithm blocks were found.
Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology.
Open Datasets | Yes | Specifically, we utilize publicly available datasets, namely LibriSpeech ASR [Panayotov et al., 2015] and Nonspeech [Hu and Wang, 2010], to generate synthetic speech samples... Speech Forgery Benchmark Dataset. In the experimental section, this study utilizes two representative speech forgery detection datasets to evaluate the performance of the proposed algorithm. FoR [Reimao and Tzerpos, 2019]... ASVspoof 2019 LA [Todisco et al., 2019] serves as a dataset specifically designed for ASV anti-spoofing purposes.
Dataset Splits | No | The paper mentions 'training and development stages' and an 'evaluation phase' for ASVspoof 2019 LA, and the 'standard version' of FoR, but does not provide specific percentages or sample counts for train/validation/test splits, nor does it specify how these splits are performed.
Hardware Specification | Yes | The proposed DEEM model is implemented in PyTorch and evaluated on an NVIDIA Tesla V100 GPU.
Software Dependencies | No | The paper mentions 'PyTorch' but does not specify a version number or other software dependencies with version numbers.
Experiment Setup | Yes | In the decoupled training phase, a learning rate adjustment strategy is employed, with an initial learning rate set to 0.0001. If the loss value does not exhibit a significant reduction after five consecutive training iterations, the learning rate is decreased. The Adam optimizer is utilized, and the training process is carried out for 150 epochs, employing the mean squared error (MSE) loss function. In the subsequent classification training phase, the same learning rate and optimizer settings are applied. The training is performed for 160 epochs, employing the cross-entropy loss function.
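
The reported schedule maps directly onto standard PyTorch components. The sketch below is a minimal, hypothetical reconstruction of the two-phase training setup, assuming placeholder models and data loaders (`decoupling_model`, `classifier_model`, `decouple_loader`, `class_loader`) that are not part of any released code; it only reflects the hyperparameters quoted above (Adam, initial LR 0.0001, LR reduction after five non-improving steps, 150 epochs with MSE, then 160 epochs with cross-entropy).

```python
# Hypothetical sketch of the two-phase training schedule described above.
# Model and loader names are placeholders; no official implementation is released.
import torch
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import ReduceLROnPlateau


def train_phase(model, loader, criterion, epochs, device="cuda"):
    """Run one training phase with the optimizer settings reported in the paper."""
    optimizer = Adam(model.parameters(), lr=1e-4)  # initial learning rate 0.0001
    # Decrease the LR when the loss shows no significant reduction for 5 consecutive checks.
    scheduler = ReduceLROnPlateau(optimizer, mode="min", patience=5)
    model.to(device).train()
    for _ in range(epochs):
        epoch_loss = 0.0
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            optimizer.zero_grad()
            loss = criterion(model(inputs), targets)
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        scheduler.step(epoch_loss)
    return model


# Phase 1: decoupled training, 150 epochs, MSE loss.
# train_phase(decoupling_model, decouple_loader, nn.MSELoss(), epochs=150)
# Phase 2: classification training, 160 epochs, cross-entropy loss.
# train_phase(classifier_model, class_loader, nn.CrossEntropyLoss(), epochs=160)
```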