What to Remember: Self-Adaptive Continual Learning for Audio Deepfake Detection
Authors: XiaoHui Zhang, Jiangyan Yi, Chenglong Wang, Chu Yuan Zhang, Siding Zeng, Jianhua Tao
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental evaluations against mainstream continual learning methods reveal the superiority of RWM in terms of knowledge acquisition and mitigating forgetting in audio deepfake detection. Furthermore, RWM's applicability extends beyond audio deepfake detection, demonstrating its potential significance in diverse machine learning domains such as image recognition. |
| Researcher Affiliation | Academia | 1 State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing, China 2 School of Computer and Information Technology, Beijing Jiaotong University, Beijing, China 3 Department of Automation, Tsinghua University, Beijing, China 4 University of Science and Technology of China, Beijing, China. |
| Pseudocode | Yes | Algorithm 1: Radian Weight Modification |
| Open Source Code | Yes | The code of the RWM has been uploaded in the supplemental material. In the foreseeable future, we plan to make the code of our method publicly available to facilitate its adoption and further research. |
| Open Datasets | Yes | We evaluate our approach on three fake audio datasets: ASVspoof2019LA (S) (Todisco et al. 2019), ASVspoof2015 (T1) (Wu et al. 2015), and In-the-Wild (T2) (Müller et al. 2022). [...] We use the CLEAR benchmark to evaluate the performance of our method for image recognition. |
| Dataset Splits | Yes | The train and evaluation datasets of each labeled subset are generated using the classic 70%/30% train-test split, as shown in Table 6 of our supplementary material. |
| Hardware Specification | No | The paper does not provide specific hardware details for the experimental setup. |
| Software Dependencies | No | The paper mentions software components like Wav2vec 2.0, XLSR-53, S-CNN, ResNet 50, Adam, and SGD, but does not provide specific version numbers for them. |
| Experiment Setup | Yes | We fine-tune the XLSR-53 and S-CNN using the Adam optimizer with a learning rate γ of 0.0001 and a batch size of 2. [...] The experiment used a batch size of 512 and an initial learning rate of 1, which decayed by a factor of 0.1 after 60 epochs. We employed the SGD optimizer with a momentum of 0.9. The α in Eq. 1 is 0.1 and the norm in Eq. 10 is the L2 norm. (See the configuration sketch after this table.) |
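
The following is a minimal PyTorch sketch of the optimizer and scheduler settings quoted in the Experiment Setup row. It is not the authors' released code: the model objects here are hypothetical placeholders standing in for the XLSR-53/S-CNN audio detector and the ResNet-50 classifier used on the CLEAR benchmark, and only the hyperparameters stated in the paper (Adam, lr 1e-4, batch size 2; SGD, lr 1, momentum 0.9, decay 0.1 after 60 epochs, batch size 512) are taken from the source.

```python
import torch

# --- Audio deepfake detection fine-tuning (XLSR-53 + S-CNN in the paper) ---
# Placeholder module; the real detector is the fine-tuned XLSR-53/S-CNN backbone.
audio_model = torch.nn.Linear(1024, 2)
audio_optimizer = torch.optim.Adam(audio_model.parameters(), lr=1e-4)  # γ = 0.0001
audio_batch_size = 2

# --- Image recognition on the CLEAR benchmark (ResNet-50 in the paper) ---
# Placeholder module; the real model is a ResNet-50 classifier.
image_model = torch.nn.Linear(2048, 10)
image_optimizer = torch.optim.SGD(image_model.parameters(), lr=1.0, momentum=0.9)
# Learning rate decays by a factor of 0.1 after 60 epochs.
scheduler = torch.optim.lr_scheduler.MultiStepLR(image_optimizer, milestones=[60], gamma=0.1)
image_batch_size = 512
```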