Disentangling Voice and Content with Self-Supervision for Speaker Recognition
Authors: Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The efficacy of the proposed framework is validated via experiments conducted on the VoxCeleb and SITW datasets, with 9.56% and 8.24% average reductions in EER and minDCF, respectively. |
| Researcher Affiliation | Academia | 1. Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore; 2. Dept. of Electrical and Computer Engineering, National University of Singapore, Singapore; 3. Dept. of Electrical and Electronic Engineering, Hong Kong Polytechnic University, Hong Kong; 4. School of Data Science, The Chinese University of Hong Kong, Shenzhen, China |
| Pseudocode | No | No explicitly labeled pseudocode or algorithm blocks found. |
| Open Source Code | No | The paper does not contain any statement or link providing concrete access to the source code for the described methodology. |
| Open Datasets | Yes | The experiments are conducted on VoxCeleb1 [54], VoxCeleb2 [13], and the Speakers in the Wild (SITW) [48] datasets. |
| Dataset Splits | Yes | The experiments are conducted on VoxCeleb1 [54], VoxCeleb2 [13], and the Speakers in the Wild (SITW) [48] datasets. ... The development set for training is the VoxCeleb2 dev set, and we use the VoxCeleb1 test set for evaluation. |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory amounts) are provided for the experimental setup. |
| Software Dependencies | No | No specific software dependencies with version numbers (e.g., Python, PyTorch, library versions) are explicitly mentioned in the paper. |
| Experiment Setup | Yes | All models are evaluated in terms of equal error rate (EER) and minimum detection cost function (minDCF); a hedged sketch of these two metrics follows the table. Detailed descriptions of datasets, training strategy, and evaluation protocol are available in Appendix C. ... We train our models with an Adam optimizer [30] with a learning rate of 0.001. |
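
Since the Experiment Setup row cites EER and minDCF as the evaluation metrics, the following is a minimal sketch of how both are commonly computed from speaker-verification trial scores. The function name `compute_eer_mindcf` and the operating-point parameters (`p_target=0.01`, `c_miss=c_fa=1.0`) are illustrative assumptions; the excerpt does not state the exact DCF parameters the paper used.

```python
import numpy as np

def compute_eer_mindcf(scores, labels, p_target=0.01, c_miss=1.0, c_fa=1.0):
    """Compute EER and minDCF from verification trial scores.

    scores: similarity scores, higher = more likely same speaker.
    labels: 0/1 ints, 1 = target (same-speaker) trial.
    The DCF parameters are assumed values, not taken from the paper.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=int)

    # Sweep every observed score as a decision threshold, highest first.
    order = np.argsort(scores)[::-1]
    labels_sorted = labels[order]

    n_target = labels.sum()
    n_nontarget = len(labels) - n_target

    # After accepting the k highest-scoring trials:
    tp = np.cumsum(labels_sorted)            # targets accepted
    fp = np.cumsum(1 - labels_sorted)        # non-targets accepted
    p_fa = fp / n_nontarget                  # false-acceptance rate
    p_miss = (n_target - tp) / n_target      # miss rate

    # EER: operating point where miss and false-acceptance rates coincide.
    idx = np.argmin(np.abs(p_miss - p_fa))
    eer = (p_miss[idx] + p_fa[idx]) / 2.0

    # minDCF: minimum detection cost over all thresholds, normalized by
    # the cost of the best trivial (all-accept or all-reject) system.
    dcf = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    dcf_default = min(c_miss * p_target, c_fa * (1.0 - p_target))
    min_dcf = dcf.min() / dcf_default

    return eer, min_dcf


if __name__ == "__main__":
    # Toy demo: target trials score higher on average.
    rng = np.random.default_rng(0)
    labels = rng.integers(0, 2, size=1000)
    scores = rng.normal(loc=labels.astype(float), scale=1.0)
    eer, min_dcf = compute_eer_mindcf(scores, labels)
    print(f"EER = {eer:.4f}, minDCF = {min_dcf:.4f}")
```

The two metrics differ in how they weight errors: EER is the single threshold at which miss and false-acceptance rates are equal, while minDCF weights the two error types by an assumed target prior and per-error costs before taking the minimum over thresholds, so a system can improve one metric without improving the other.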