Scaling Sign Language Translation
Authors: Biao Zhang, Garrett Tanzer, Orhan Firat
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform large-scale SLT pretraining on different data... We finetune the pretrained SLT models on 5 downstream open-domain SLT benchmarks... Experiments show substantial quality improvements over the vanilla baselines, surpassing the previous state-of-the-art (SOTA) by wide margins. |
| Researcher Affiliation | Industry | Biao Zhang, Garrett Tanzer, Orhan Firat, Google DeepMind {biaojiaxing,gtanzer,orhanf}@google.com |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | No | The used (m/By)T5 model checkpoints are publicly available, but the framework we used to finetune them with multimodal inputs has not been open sourced, so we are unable to release our code. |
| Open Datasets | Yes | We use the parallel sentence-level portion of MADLAD-400 [23] as the MT pretraining data. ... MADLAD-400 is publicly available, but our noisy YouTube dataset is not. |
| Dataset Splits | Yes | Table 1: Summary of downstream SLT benchmarks. #Train/#Dev/#Test: the number of examples in the train, dev and test split. |
| Hardware Specification | Yes | We pretrain models up to 1M steps using 64/64/128 TPU-v3 chips for Base/Large/XL, taking 720 days. |
| Software Dependencies | No | The paper mentions software components like T5, Adafactor, MediaPipe Holistic landmarks, BLEU, ChrF, and BLEURT, but does not provide specific version numbers for any of them. |
| Experiment Setup | Yes | For Pretraining, we use a batch size of 256 and a constant learning rate of 0.001. We optimize models with Adafactor [39], and set the maximum text input, landmark input, and text output length to 512. For Finetuning, we use a batch size of 32 and a constant learning rate of 0.0005. |
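
The Experiment Setup row reports concrete hyperparameters: Adafactor, constant learning rates (0.001 for pretraining, 0.0005 for finetuning), batch sizes of 256 and 32, and 512-token length limits. A minimal sketch of that optimization setup in JAX/optax is below; the names and structure are illustrative, since the authors' multimodal finetuning framework is not open sourced.

```python
# Minimal sketch of the reported optimization setup using optax (JAX).
# The hyperparameter values come from the paper's experiment setup;
# everything else (names, structure) is an assumption, not the authors' code.
import optax

PRETRAIN = {"batch_size": 256, "learning_rate": 1e-3, "max_length": 512}
FINETUNE = {"batch_size": 32, "learning_rate": 5e-4}


def make_adafactor(learning_rate: float) -> optax.GradientTransformation:
    """Adafactor with a constant learning-rate schedule, as described in the paper."""
    return optax.adafactor(learning_rate=optax.constant_schedule(learning_rate))


pretrain_optimizer = make_adafactor(PRETRAIN["learning_rate"])
finetune_optimizer = make_adafactor(FINETUNE["learning_rate"])
```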
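
The Software Dependencies row notes that the metric tooling (BLEU, ChrF, BLEURT) is listed without version numbers, which matters because BLEU and ChrF scores are sensitive to the implementation and tokenization used. The sketch below uses sacrebleu as one plausible scorer (an assumption; the paper does not name its implementation) and records the package version alongside the scores. BLEURT would additionally require Google's separate `bleurt` package and a specific checkpoint.

```python
# Sketch of corpus-level BLEU and ChrF scoring with sacrebleu.
# sacrebleu is an assumption; the paper does not say which implementation
# (or version) produced its reported numbers.
import sacrebleu

hypotheses = ["a translated sentence", "another translated sentence"]
references = [["a reference sentence", "another reference sentence"]]

bleu = sacrebleu.corpus_bleu(hypotheses, references)
chrf = sacrebleu.corpus_chrf(hypotheses, references)

# Log the scorer version so the evaluation can be reproduced exactly.
print(f"sacrebleu=={sacrebleu.__version__}  BLEU={bleu.score:.2f}  ChrF={chrf.score:.2f}")
```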