reproducibilityindex.ai

Glyce: Glyph-vectors for Chinese Character Representations

Authors: Yuxian Meng, Wei Wu, Fei Wang, Xiaoya Li, Ping Nie, Fan Yin, Muyu Li, Qinghong Han, Xiaofei Sun, Jiwei Li

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show that glyph-based models are able to consistently outperform word/char ID-based models in a wide range of Chinese NLP tasks. We are able to set new state-of-the-art results for a variety of Chinese NLP tasks, including tagging (NER, CWS, POS), sentence pair classification, single sentence classification tasks, dependency parsing, and semantic role labeling.
Researcher Affiliation	Industry	Shannon.AI {yuxian_meng, wei_wu, fei_wang, xiaoya_li, ping_nie, fan_yin, muyu_li, qinghong_han, xiaofei_sun, jiwei_li}@shannonai.com
Pseudocode	No	The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code	Yes	Code is available at https://github.com/ShannonAI/glyce.
Open Datasets	Yes	NER: For the task of Chinese NER, we used the widely-used OntoNotes, MSRA, Weibo and resume datasets. We used the widely-used PKU, MSR, CITYU and AS benchmarks from SIGHAN 2005 bake-off for evaluation. We use the CTB5, CTB9 and UD1 (Universal Dependencies) benchmarks to test our models. We employ the following four different datasets: (1) BQ (binary classification task) [Bowman et al., 2015]; (2) LCQMC (binary classification task) [Liu et al., 2018]; (3) XNLI (three-class classification task) [Williams and Bowman]; and (4) NLPCC-DBQA. Datasets that we use include: (1) ChnSentiCorp (binary classification); (2) the Fudan corpus (5-class classification) [Li, 2011]; and (3) Ifeng (5-class classification). For dependency parsing [Chen and Manning, 2014, Dyer et al., 2015], we used the widely-used Chinese Penn Treebank 5.1 dataset for evaluation. For the task of semantic role labeling (SRL) [Roth and Lapata, 2016, Marcheggiani and Titov, 2017, He et al., 2018], we used the CoNLL-2009 shared-task.
Dataset Splits	Yes	To enable apples-to-apples comparison, we perform grid parameter search for both baselines and the proposed model on the dev set.
Hardware Specification	No	The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models.
Software Dependencies	No	The paper does not provide specific ancillary software details with version numbers, such as programming languages, libraries, or frameworks used for implementation.
Experiment Setup	No	The paper mentions performing a 'grid parameter search' but does not provide concrete hyperparameter values or detailed training configurations within the main text.
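
The "grid parameter search ... on the dev set" noted under Dataset Splits and Experiment Setup can be illustrated with a minimal Python sketch. Since the paper reports no concrete hyperparameter values, everything here is an assumption made for illustration: the grid (`HYPOTHETICAL_GRID`), its parameter names and values, and the toy scoring function that stands in for training a model and scoring it on a dev set. It is not the authors' procedure.

```python
from itertools import product


def grid_search(grid, evaluate_on_dev):
    """Try every combination in `grid`; return (best_config, best_dev_score)."""
    names = sorted(grid)
    best_config, best_score = None, float("-inf")
    for values in product(*(grid[name] for name in names)):
        config = dict(zip(names, values))
        score = evaluate_on_dev(config)  # higher is better
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical grid: these names and values are NOT taken from the paper.
HYPOTHETICAL_GRID = {
    "learning_rate": [1e-3, 1e-4],
    "dropout": [0.1, 0.3, 0.5],
}


def toy_dev_score(config):
    """Stand-in for 'train with config, then score on the dev set'."""
    return -abs(config["learning_rate"] - 1e-4) * 1000 - abs(config["dropout"] - 0.3)


best_config, best_score = grid_search(HYPOTHETICAL_GRID, toy_dev_score)
print(best_config, best_score)
```

Running the same search for both the baselines and the proposed model, with selection on the dev set only, is what keeps the comparison "apples-to-apples"; the test set is evaluated once with the selected configuration.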