Revisiting End-to-End Speech-to-Text Translation From Scratch

Authors: Biao Zhang, Barry Haddow, Rico Sennrich

ICML 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | On four benchmarks covering 23 languages, our experiments show that, without using any transcripts or pretraining, the proposed system reaches and even outperforms previous studies adopting pretraining, although the gap remains in (extremely) low-resource settings. Experimental results show that the significance of pretraining has been over-estimated in prior work, and integrating techniques to improve E2E ST from scratch is feasible and promising.
Researcher Affiliation | Academia | School of Informatics, University of Edinburgh; Department of Computational Linguistics, University of Zurich.
Pseudocode | No | The paper describes the methods and algorithms in natural language and mathematical equations, but it does not include any formal pseudocode or algorithm blocks.
Open Source Code | Yes | Source code is available at https://github.com/bzhangGo/zero.
Open Datasets | Yes | We work on four benchmarks covering different domains and 23 languages from diverse language families. MuST-C ... (Di Gangi et al., 2019), ... LibriSpeech En-Fr ... (Kocabiyikoglu et al., 2018). ... Kosp2e Ko-En ... (Cho et al., 2021). ... CoVoST ... (Ardila et al., 2020).
Dataset Splits | Yes | For each benchmark, we use the official train/dev/test split for experiments.
Hardware Specification | No | The paper does not specify any particular hardware used for running the experiments (e.g., GPU models, CPU types, or cloud computing instance details).
Software Dependencies | No | The paper mentions software tools like "Adam" (Kingma & Ba, 2015), "Moses" (Koehn et al., 2007), and "SacreBLEU" (Post, 2018), but it does not provide specific version numbers for any of these software dependencies. (See the SacreBLEU scoring sketch after this table.)
Experiment Setup Yes We employ Adam (Kingma & Ba, 2015, β1 = 0.9, β2 = 0.98) for parameter update using adaptive learning rate schedule as in (Vaswani et al., 2017) with a warmup step of 4K and label smoothing of 0.1. Dropout of rate 0.2 is applied to residual connections and Re LU activations. We organize training samples of around 20K target subwords into one batch, and train models up to 50K steps.