Improving Gloss-free Sign Language Translation by Reducing Representation Density
Authors: Jinhui Ye, Xing Wang, Wenxiang Jiao, Junwei Liang, Hui Xiong
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that the proposed SignCL can significantly reduce the representation density and improve performance across various translation frameworks. Specifically, SignCL improves the BLEU scores of the Sign Language Transformer and GFSLT-VLP on the CSL-Daily dataset by 39% and 46%, respectively, without any increase in model parameters. (A hedged sketch of such a contrastive objective follows the table.) |
| Researcher Affiliation | Collaboration | (1) Artificial Intelligence Thrust, HKUST (Guangzhou), Guangzhou, China; (2) Department of Computer Science and Engineering, HKUST, Hong Kong SAR, China; (3) Guangzhou HKUST Fok Ying Tung Research Institute; (4) Tencent AI Lab |
| Pseudocode | No | The paper includes figures illustrating strategies (Figure 4a, 4b, 4c) and equations, but no structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Implementation and checkpoints are available at https://github.com/JinhuiYE/SignCL. |
| Open Datasets | Yes | We primarily use the PHOENIX-2014T benchmark [3] to investigate the representation density problem in existing sign feature extraction techniques. ... Specifically, SignCL improves the BLEU scores of the Sign Language Transformer and GFSLT-VLP on the CSL-Daily dataset by 39% and 46%, respectively... |
| Dataset Splits | No | Due to the limited number of samples for each gesture in the dev set, we rank the sign glosses based on their density using SDR(G_i) under SMKD features (see Eqn. 1). |
| Hardware Specification | Yes | All experiments are conducted using PyTorch on 8× NVIDIA A800 GPUs for about 12 hours. |
| Software Dependencies | No | All experiments are conducted using PyTorch on 8× NVIDIA A800 GPUs for about 12 hours. This mentions 'PyTorch' but does not specify a version number, nor are other specific software versions provided. |
| Experiment Setup | Yes | Table 4: Hyperparameters of Sign Language Transformer models (PHOENIX-2014T / CSL-Daily): encoder layers 3 / 1; decoder layers 3 / 1; attention heads 8 / 8; ctc layers 1 / 1; hidden size 512 / 512; activation function gelu / gelu; learning rate 1e-3 / 1e-3; Adam β (0.9, 0.98) / (0.9, 0.98); label smoothing 0.1 / 0.1; max output length 30 / 50; dropout 0.3 / 0.3; batch size 128 / 128. (A hedged config sketch follows the table.) |
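
The "Research Type" row credits SignCL, a contrastive objective applied to sign feature sequences, with the reported BLEU gains. As a rough illustration of how such a frame-level contrastive loss can be wired up in PyTorch, here is a minimal sketch: it treats temporally adjacent frames as positives and frames a fixed offset apart as margin-separated negatives. The function name, the `pos_dist`/`neg_dist` offsets, and the margin value are illustrative assumptions, not the authors' exact formulation; see their repository for the real implementation.

```python
import torch
import torch.nn.functional as F

def signcl_loss(features: torch.Tensor, pos_dist: int = 1,
                neg_dist: int = 10, margin: float = 1.0) -> torch.Tensor:
    """Sketch of a frame-level contrastive loss over a sign feature sequence.

    features: (T, D) frame features from the visual encoder.
    Frames `pos_dist` apart are pulled together; frames `neg_dist` apart
    are pushed at least `margin` apart (hinge on squared distance).
    All offsets and the margin are illustrative, not the paper's values.
    """
    assert features.size(0) > neg_dist, "sequence too short for neg_dist"
    anchors = features[:-neg_dist]                            # (T', D)
    positives = features[pos_dist:pos_dist + anchors.size(0)]  # nearby frames
    negatives = features[neg_dist:neg_dist + anchors.size(0)]  # distant frames

    pos_term = F.pairwise_distance(anchors, positives).pow(2)
    neg_term = F.relu(margin - F.pairwise_distance(anchors, negatives)).pow(2)
    return (pos_term + neg_term).mean()

# Usage: 64 frames of 512-dim features, as in the hidden size from Table 4.
feats = torch.randn(64, 512)
loss = signcl_loss(feats)
```

Because the loss is computed purely on the encoder's frame features, it can be added to an existing translation objective without new parameters, which is consistent with the paper's claim of gains "without any increase in model parameters."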
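
For reference, the Table 4 hyperparameters can be captured in a plain config dict. The key names below follow common Transformer-training conventions and are assumptions for illustration, not the authors' actual configuration schema.

```python
# Hypothetical config mirroring Table 4 (PHOENIX-2014T column);
# key names are illustrative, not taken from the authors' code.
SLT_CONFIG_PHOENIX2014T = dict(
    encoder_layers=3,        # CSL-Daily uses 1
    decoder_layers=3,        # CSL-Daily uses 1
    attention_heads=8,
    ctc_layers=1,
    hidden_size=512,
    activation_fn="gelu",
    learning_rate=1e-3,
    adam_betas=(0.9, 0.98),
    label_smoothing=0.1,
    max_output_length=30,    # CSL-Daily uses 50
    dropout=0.3,
    batch_size=128,
)
```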