Learning Efficient Representations for Fake Speech Detection
Authors: Nishant Subramani, Delip Rao
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present four parameter-efficient convolutional architectures for fake speech detection with best detection F1 scores of around 97 points on a large dataset of fake and bonafide speech. We show how the fake speech detection task naturally lends itself to a novel multi-task problem further improving F1 scores for a mere 0.5% increase in model parameters. (A hedged sketch of this parameter overhead appears after the table.) |
| Researcher Affiliation | Industry | Nishant Subramani, Delip Rao; AI Foundation, San Francisco, California; {nishant, delip}@aifoundation.com |
| Pseudocode | No | The paper describes model architectures using block diagrams and textual descriptions (e.g., "Efficient CNN A model consisting of an input processing block, 4 convolution blocks, and a classification block."), but does not provide formal pseudocode or algorithm blocks. (A hedged PyTorch sketch of this block layout appears after the table.) |
| Open Source Code | No | The paper does not contain any statements about releasing its own source code, nor does it provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use datasets from Todisco et al. (2019), originally created for the ASVSpoof 2019 challenge. We select 226 speakers from the Mozilla Common Voice project on the English side which have 3 or more contributed utterances and have demographic information. To construct the training set, we randomly choose 17 sentences from the English side of the IWSLT16 English to German translation training set. |
| Dataset Splits | Yes | Table 1: Summary of ASVSpoof2019 data used here (Training: 25,380 samples; Validation: 5,438 samples). Our 5S recipe produces a set of 2624 BONAFIDE and 3842 FAKE utterances for training as well as 660 BONAFIDE and 1001 FAKE examples for validation. |
| Hardware Specification | Yes | For our experiments, we use a single NVIDIA V100 GPU unless otherwise specified. |
| Software Dependencies | No | The paper mentions using an "Adam optimizer with default parameters" and references a paper (Kingma and Ba, 2014), but it does not specify any software libraries (e.g., PyTorch, TensorFlow, scikit-learn) or their version numbers. |
| Experiment Setup | Yes | For optimization, we use mini-batch SGD with a batch size of 128 and an Adam optimizer with default parameters (α = 10⁻³, β = [0.9, 0.999]) (Kingma and Ba, 2014). We halve the learning rate, α, during training whenever there is no improvement in validation set loss, completely stopping when the learning rate drops below 10⁻⁵. (A hedged PyTorch sketch of this recipe appears after the table.) |
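For readers who want a concrete picture of the block layout quoted in the Pseudocode row, the following is a minimal PyTorch sketch. The paper specifies only the block-level structure (an input processing block, 4 convolution blocks, and a classification block), so the input shape, channel widths, kernel sizes, and pooling choices below are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class EfficientCNNA(nn.Module):
    """Hedged sketch of the paper's "Efficient CNN A": an input processing
    block, 4 convolution blocks, and a classification block. Layer sizes and
    the spectrogram input shape are assumptions."""

    def __init__(self, n_classes: int = 2, width: int = 16):
        super().__init__()
        # Input processing block: normalize the 1-channel spectrogram and
        # lift it to `width` feature maps (assumed design).
        self.input_block = nn.Sequential(
            nn.BatchNorm2d(1),
            nn.Conv2d(1, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # 4 convolution blocks, each conv -> BN -> ReLU -> pool, doubling
        # channels while halving time/frequency resolution (assumed).
        blocks, in_ch = [], width
        for _ in range(4):
            out_ch = in_ch * 2
            blocks += [
                nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                nn.BatchNorm2d(out_ch),
                nn.ReLU(inplace=True),
                nn.MaxPool2d(2),
            ]
            in_ch = out_ch
        self.conv_blocks = nn.Sequential(*blocks)
        # Classification block: global average pooling keeps the parameter
        # count small, then a single linear layer scores the classes.
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(in_ch, n_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_mels, time) log-mel spectrogram (assumed input)
        return self.classifier(self.conv_blocks(self.input_block(x)))
```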
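The abstract quoted in the Research Type row attributes the multi-task gain to a mere 0.5% increase in model parameters. A second linear head over a shared trunk has roughly that footprint. The sketch below continues from the hypothetical `EfficientCNNA` class above and uses a made-up auxiliary label count of 10; the paper's actual auxiliary task is not quoted here, so this only illustrates how the overhead can be measured.

```python
import torch.nn as nn

# Hedged sketch: a second linear head over the shared trunk's pooled
# features stands in for the paper's (unquoted) auxiliary task.
model = EfficientCNNA()
feat_dim = 16 * 2 ** 4               # trunk output channels in the sketch above
aux_head = nn.Linear(feat_dim, 10)   # n_aux = 10, hypothetical label count

base = sum(p.numel() for p in model.parameters())
extra = sum(p.numel() for p in aux_head.parameters())
print(f"auxiliary-head overhead: {100 * extra / base:.2f}% of base parameters")
```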
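The optimization recipe quoted in the Experiment Setup row maps naturally onto PyTorch's built-in `ReduceLROnPlateau` scheduler. A minimal sketch follows, assuming `model` (from the sketch above), a `train_loader` yielding batches of 128, and a `val_loss()` helper exist; the paper does not quote a patience value, so `patience=0` is used here to mirror halving "whenever there is no improvement".

```python
import torch

# Adam with the quoted defaults: lr = 1e-3, betas = (0.9, 0.999).
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, betas=(0.9, 0.999))
# Halve the learning rate when validation loss stops improving.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=0
)
criterion = torch.nn.CrossEntropyLoss()

for epoch in range(1000):  # upper bound; the stop rule below ends training
    model.train()
    for x, y in train_loader:  # assumed loader with batch size 128
        optimizer.zero_grad()
        loss = criterion(model(x), y)
        loss.backward()
        optimizer.step()
    scheduler.step(val_loss(model))  # assumed helper returning validation loss
    if optimizer.param_groups[0]["lr"] < 1e-5:
        break  # learning rate dropped below 10^-5: stop training
```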