Non-Autoregressive Neural Text-to-Speech

Authors: Kainan Peng, Wei Ping, Zhao Song, Kexin Zhao

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we present several experiments to evaluate the proposed ParaNet and WaveVAE. ... We report the MOS results in Table 2. ... We test synthesis speed of all models on NVIDIA GeForce GTX 1080 Ti with 32-bit floating point (FP32) arithmetic."
Researcher Affiliation | Industry | "Kainan Peng¹, Wei Ping¹, Zhao Song¹, Kexin Zhao¹ ... ¹Baidu Research, 1195 Bordeaux Dr, Sunnyvale, CA. ... Correspondence to: Wei Ping <weiping.thu@gmail.com>."
Pseudocode | No | The paper describes the model architectures and algorithms in text and figures, but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | No | "Speech samples can be found in: https://parallel-neural-tts-demo.github.io/." ... "We use an open source reimplementation of FastSpeech¹ by adapting the hyperparameters for handling the 24 kHz dataset. ¹https://github.com/xcmyz/FastSpeech" The paper links to a demo page for speech samples and references a third-party open-source implementation of FastSpeech, but does not provide the source code for its own proposed ParaNet or WaveVAE models.
Open Datasets | No | "In our experiment, we use an internal English speech dataset containing about 20 hours of speech data from a female speaker with a sampling rate of 48 kHz."
Dataset Splits | No | The paper mentions training models for a certain number of steps and using audio clips, but does not explicitly specify training/validation/test dataset splits (e.g., percentages or sample counts per split).
Hardware Specification | Yes | "We test synthesis speed of all models on NVIDIA GeForce GTX 1080 Ti with 32-bit floating point (FP32) arithmetic. ... We train all neural vocoders on 8 Nvidia 1080Ti GPUs."
Software Dependencies | No | The paper mentions the Adam optimizer and certain architectures (e.g., WaveNet, ClariNet, WaveGlow), but does not specify software dependencies with version numbers (e.g., Python, TensorFlow/PyTorch versions).
Experiment Setup | Yes | "Table 1. Hyperparameters of autoregressive text-to-spectrogram model and non-autoregressive ParaNet in the experiment."