Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Unsupervised Learning Helps Supervised Neural Word Segmentation
Authors: Xiaobin Wang, Deng Cai, Linlin Li, Guangwei Xu, Hai Zhao, Luo Si
Pages: 7200-7207
AAAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on standard data sets show that the explored strategies indeed improve the recall rate of out-of-vocabulary words and thus boost the segmentation accuracy. Moreover, the model enhanced by the proposed methods outperforms state-of-the-art models in closed test and shows promising improvement trend when adopting three different strategies with the help of a large unlabeled data set. Our thorough empirical study eventually verifies the proposed approach outperforms the widely used pre-training approach in terms of effectively making use of freely abundant unlabeled data. |
| Researcher Affiliation | Collaboration | Xiaobin Wang¹, Deng Cai², Linlin Li¹, Guangwei Xu¹, Hai Zhao³, Luo Si¹. ¹Alibaba Group, ²The Chinese University of Hong Kong, ³Shanghai Jiao Tong University |
| Pseudocode | Yes | Algorithm 1 multi-task learning with unlabeled data |
| Open Source Code | No | The baseline model implementation is cloned from GitHub for the baseline segmenter (footnote 4: https://github.com/jcyk/greedyCWS). ... We used an open-source version of NPYLM based segmenter (footnote 5: https://github.com/musyoku/python-npylm) as the unsupervised segmenter, which generates segmented texts for the label embedding and multi-task learning approaches. The paper provides links to third-party code used, but not its own implementation of the proposed methods. |
| Open Datasets | Yes | We evaluate the effectiveness of our methods by F1-score on the widely used benchmark datasets, i.e., PKU, MSR, AS and CITYU, from the 2nd international CWS Bakeoff (Bakeoff-2005) (Emerson 2005). |
| Dataset Splits | Yes | Table 3: Statistics of the dataset, number of sentences (#s) and words (#w), listed in the order MSR / PKU / AS / CITYU. Train: #s 78k / 17k / 638k / 48k; #w 2,122k / 1,010k / 4,904k / 1,310k. Dev: #s 8.7k / 1.9k / 71k / 5.3k; #w 246k / 100k / 545k / 146k. Test: #s 4.0k / 1.9k / 14k / 1.4k; #w 106k / 104k / 123k / 41k. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments (e.g., GPU models, CPU types, or cloud instance specifications). |
| Software Dependencies | No | The paper mentions using a baseline segmenter and an open-source NPYLM segmenter but does not provide specific version numbers for any software dependencies. |
| Experiment Setup | Yes | Table 4: Hyper-parameters of the baseline model. Character embedding size: 100; word embedding size: 50; hidden unit number: 50; margin loss discount: 0.2; maximum word length: 6; decoding beam size: 1. |
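
For anyone re-running the baseline, the Table 4 values quoted in the Experiment Setup row can be collected in a small configuration object. The following is a minimal Python sketch, not the authors' code: the class name `BaselineHyperParams` and all field names are hypothetical labels chosen here, and only the numeric values come from the table.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class BaselineHyperParams:
    """Hyper-parameters of the baseline model as quoted from Table 4.

    Field names are illustrative assumptions; the values are from the table.
    """
    char_embedding_size: int = 100    # character embedding size
    word_embedding_size: int = 50     # word embedding size
    hidden_units: int = 50            # hidden unit number
    margin_loss_discount: float = 0.2 # margin loss discount
    max_word_length: int = 6          # maximum word length
    beam_size: int = 1                # decoding beam size


config = BaselineHyperParams()

# Dump the configuration as a plain dict, e.g. for logging next to a run.
print(asdict(config))
```

Freezing the dataclass keeps the quoted values from being changed by accident; a different setting would be created explicitly, for example `BaselineHyperParams(beam_size=4)`.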