Self-Emphasizing Network for Continuous Sign Language Recognition

Authors: Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng

AAAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | A comprehensive comparison with previous methods equipped with hand and face features demonstrates the superiority of our method, even though they always require huge computations and rely on expensive extra supervision. Remarkably, with few extra computations, SEN achieves new state-of-the-art accuracy on four large-scale datasets, PHOENIX14, PHOENIX14-T, CSL-Daily, and CSL. Visualizations verify the effects of SEN on emphasizing informative spatial and temporal features.
Researcher Affiliation | Academia | Lianyu Hu, Liqing Gao, Zekang Liu, Wei Feng*, College of Intelligence and Computing, Tianjin University, Tianjin 300350, China; {hly2021, lqgao, lzk100953}@tju.edu.cn, wfeng@ieee.org
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/hulianyuyy/SEN_CSLR
Open Datasets | Yes | Datasets. PHOENIX14 (Koller, Forster, and Ney 2015) and PHOENIX14-T (Camgoz et al. 2018)... CSL-Daily (Zhou et al. 2021)... CSL (Huang et al. 2018)...
Dataset Splits | Yes | PHOENIX14/PHOENIX14-T [...] divided into 5672/7096 training samples, 540/519 development (Dev) samples and 629/642 testing (Test) samples. CSL-Daily [...] divided into 18401 training samples, 1077 development (Dev) samples and 1176 testing (Test) samples. CSL [...] divided into training and testing sets by a ratio of 8:2.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions using ResNet18 and the Adam optimizer, but does not provide specific version numbers for software dependencies or libraries.
Experiment Setup | Yes | We train our model for 80 epochs with an initial learning rate of 0.0001, decayed by a factor of 5 after 40 and 60 epochs. The Adam optimizer is adopted with weight decay 0.001 and batch size 2. All frames are first resized to 256×256 and then randomly cropped to 224×224, with 50% horizontal flip and 20% random temporal scaling during training. During inference, a central 224×224 crop is selected.
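
The reported training recipe maps onto a standard deep-learning configuration. Below is a minimal sketch of that setup, assuming a PyTorch/torchvision pipeline; the placeholder model, the skeleton training loop, and the omitted dataloader are illustrative assumptions and not the authors' released implementation (see the linked repository for that).

# Minimal sketch of the reported training setup, assuming PyTorch/torchvision.
# The model and training loop below are placeholders, not the authors' code.
import torch.nn as nn
from torch.optim import Adam
from torch.optim.lr_scheduler import MultiStepLR
from torchvision import transforms

# Spatial augmentation as reported: resize to 256x256, random 224x224 crop,
# and 50% horizontal flip during training; a central 224x224 crop at inference.
# (The reported 20% random temporal scaling acts on the frame sequence and is
# omitted here.)
train_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.RandomCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),
])
test_transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
])

# Placeholder standing in for the ResNet18-based SEN network.
model = nn.Linear(10, 10)

# Adam with initial LR 1e-4 and weight decay 1e-3; the LR is divided by 5
# (gamma = 0.2) after epochs 40 and 60, for 80 epochs total with batch size 2.
optimizer = Adam(model.parameters(), lr=1e-4, weight_decay=1e-3)
scheduler = MultiStepLR(optimizer, milestones=[40, 60], gamma=0.2)

for epoch in range(80):
    # ... iterate over batches of size 2, compute the sequence-level loss,
    #     then call loss.backward() and optimizer.step() ...
    scheduler.step()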