Learning Multi-Level Dependencies for Robust Word Recognition
Authors: Zhiwei Wang, Hui Liu, Jiliang Tang, Songfan Yang, Gale Yan Huang, Zitao Liu
AAAI 2020, pp. 9250-9257 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to verify the effectiveness of the framework. The results show that the proposed framework outperforms state-of-the-art methods by a large margin and they also suggest that character-level dependencies can play an important role in word recognition. |
| Researcher Affiliation | Collaboration | Zhiwei Wang¹, Hui Liu¹, Jiliang Tang¹, Songfan Yang², Gale Yan Huang², Zitao Liu² (¹Michigan State University, {wangzh65, liuhui7, tangjili}@msu.edu; ²TAL AI Lab, TAL Education Group, {yangsongfan, galehuang, liuzitao}@100tal.com) |
| Pseudocode | No | The paper describes the model architecture and training procedures in text but does not include any explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code of the proposed framework and the major experiments are publicly available at https://github.com/DSE-MSU/MUDE. |
| Open Datasets | Yes | We use the publicly available Penn Treebank (Marcus, Santorini, and Marcinkiewicz 1993) as the dataset. |
| Dataset Splits | Yes | We use the same training, validation and testing split in (Sakaguchi et al. 2017), which contains 39,832, 1,700 and 2,416 sentences, respectively. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., GPU/CPU models, memory). |
| Software Dependencies | No | The paper mentions "Pytorch" but does not specify any version numbers for PyTorch or other software dependencies, which is required for reproducibility. |
| Experiment Setup | Yes | The number of hidden units of word representations is set to 650 as suggested by previous work (Sakaguchi et al. 2017). The learning rate is chosen from {0.1, 0.01, 0.001, 0.0001} and β in Eq. (11) is chosen from {1, 0.1, 0.001} according to the model performance on the validation dataset. The parameters of MUDE are learned with a stochastic gradient descent algorithm, and RMSprop (Tieleman and Hinton 2012) is chosen as the optimizer, as was done in (Sakaguchi et al. 2017). (A code sketch of this setup follows the table.) |
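
The reported setup (650 hidden units, a learning-rate grid of {0.1, 0.01, 0.001, 0.0001}, β in {1, 0.1, 0.001}, and the RMSprop optimizer) can be written as a small training-setup sketch. This is a minimal illustration in PyTorch, not the authors' released code: `build_model`, `train_epoch`, and `evaluate` are hypothetical helpers, and the epoch count is an assumption; only the grid values, hidden size, and optimizer choice come from the paper.

```python
import itertools
import torch

# Values reported in the paper.
HIDDEN_UNITS = 650                          # word-representation size (Sakaguchi et al. 2017)
LEARNING_RATES = [0.1, 0.01, 0.001, 0.0001]
BETAS = [1, 0.1, 0.001]                     # weight of the second loss term in Eq. (11)


def select_hyperparameters(build_model, train_epoch, evaluate,
                           train_data, val_data, num_epochs=10):
    """Grid-search the learning rate and beta on the validation split.

    `build_model`, `train_epoch`, and `evaluate` are hypothetical helpers;
    the paper does not specify their interfaces.
    """
    best = {"val_score": float("-inf")}
    for lr, beta in itertools.product(LEARNING_RATES, BETAS):
        model = build_model(hidden_units=HIDDEN_UNITS)
        # RMSprop optimizer, following Sakaguchi et al. (2017).
        optimizer = torch.optim.RMSprop(model.parameters(), lr=lr)
        for _ in range(num_epochs):
            train_epoch(model, optimizer, train_data, beta=beta)
        val_score = evaluate(model, val_data)
        if val_score > best["val_score"]:
            best = {"lr": lr, "beta": beta, "val_score": val_score}
    return best
```

The grid covers 4 × 3 = 12 configurations; each is trained on the training split, scored on the validation split described above, and the best (learning rate, β) pair is kept for the final evaluation.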