Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Learning Efficient Representations for Fake Speech Detection
Authors: Nishant Subramani, Delip Rao | Pages: 5859-5866
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present four parameter-efficient convolutional architectures for fake speech detection with best detection F1 scores of around 97 points on a large dataset of fake and bonafide speech. We show how the fake speech detection task naturally lends itself to a novel multi-task problem further improving F1 scores for a mere 0.5% increase in model parameters. |
| Researcher Affiliation | Industry | Nishant Subramani, Delip Rao AI Foundation San Francisco, California EMAIL |
| Pseudocode | No | The paper describes model architectures using block diagrams and textual descriptions (e.g., "Efficient CNN A model consisting of an input processing block, 4 convolution blocks, and a classification block."), but does not provide formal pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statements about releasing its own source code, nor does it provide a link to a code repository for the described methodology. |
| Open Datasets | Yes | We use datasets from Todisco et al. (2019), originally created for the ASVSpoof 2019 challenge. We select 226 speakers from the Mozilla Common Voice project on the English side which have 3 or more contributed utterances and have demographic information. To construct the training set, we randomly choose 17 sentences from the English side of the IWSLT16 English to German translation training set. |
| Dataset Splits | Yes | Table 1: Summary of ASVSpoof2019 data used here (#samples: Training 25,380; Validation 5,438). Our 5S recipe produces a set of 2624 BONAFIDE and 3842 FAKE utterances for training as well as 660 BONAFIDE and 1001 FAKE examples for validation. |
| Hardware Specification | Yes | For our experiments, we use a single NVIDIA V100 GPU unless otherwise specified. |
| Software Dependencies | No | The paper mentions using an "Adam optimizer with default parameters" and references a paper (Kingma and Ba, 2014), but it does not specify any software libraries (e.g., PyTorch, TensorFlow, scikit-learn) or their version numbers. |
| Experiment Setup | Yes | For optimization, we use mini-batch SGD with a batch size of 128 and an Adam optimizer with default parameters (α = 10⁻³, β = [0.9, 0.999]) (Kingma and Ba, 2014). We halve the learning rate, α, during training whenever there is no improvement in validation set loss: completely stopping when the learning rate drops below 10⁻⁵. |
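
The learning-rate rule quoted in the Experiment Setup row can be illustrated with a small framework-free sketch. This is not the authors' code: the function name `halving_schedule`, the per-epoch validation check, and the choice to compare each loss against the best loss seen so far are all assumptions made for illustration. The paper excerpt states only the starting rate (10⁻³), the halving rule, and the stopping threshold (10⁻⁵).

```python
def halving_schedule(val_losses, lr=1e-3, min_lr=1e-5):
    """Simulate a halve-on-no-improvement learning-rate schedule.

    Assumptions (not stated in the source): validation loss is checked
    once per epoch, and "improvement" means beating the best loss so far.
    Returns (epochs_run, lr_history), where lr_history[i] is the learning
    rate used during epoch i + 1.
    """
    best = float("inf")
    history = []
    for epoch, loss in enumerate(val_losses, start=1):
        history.append(lr)
        if loss < best:
            best = loss          # improvement: keep the current rate
        else:
            lr /= 2              # no improvement: halve the rate
            if lr < min_lr:      # stop once the rate falls below the floor
                return epoch, history
    return len(val_losses), history


if __name__ == "__main__":
    # Hypothetical validation losses, used only to exercise the schedule.
    epochs, rates = halving_schedule([1.0, 0.9, 0.95, 0.8, 0.85, 0.85])
    print(epochs, rates)
```

Under these assumptions, seven consecutive halvings take the rate from 10⁻³ to about 7.8 × 10⁻⁶, which is the first value below the 10⁻⁵ threshold, so a run whose validation loss never improves after the first epoch would stop at epoch 8.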