Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Transition-Based Neural Word Segmentation Using Word-Level Features
Authors: Meishan Zhang, Yue Zhang, Guohong Fu
JAIR 2018 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on several benchmark datasets to thoroughly examine the effectiveness of neural word features. Results show the effectiveness of word- and subword-level features for neural Chinese word segmentation. With pretrained character and word embeddings, our method achieves state-of-the-art results. In addition, a combination of our neural features and the traditional discrete features results in further improved performance. We conduct a number of experimental analyses for a deeper understanding of our proposed neural model. |
| Researcher Affiliation | Academia | Meishan Zhang EMAIL School of Computer Science and Technology, Heilongjiang University, Harbin, China; Yue Zhang EMAIL Westlake University, Hangzhou, China; Guohong Fu EMAIL School of Computer Science and Technology, Heilongjiang University, Harbin, China. |
| Pseudocode | Yes | Algorithm 1 (beam-search decoding, where Θ is the set of all model parameters): function Decode(c1…cn, Θ): agenda ← {(φ (empty stack), c1…cn (queue), score = 0.0)}; for k in 1…n: list ← {}; for each candidate in agenda: new ← Apply(SEP, candidate, ck, Θ); AddItem(list, new); new ← Apply(APP, candidate, ck, Θ); AddItem(list, new); agenda ← Top-B(list, B); best ← BestItem(agenda); w1…wm ← ExtractWords(best) |
| Open Source Code | Yes | We make our codes and models publicly available under GPL at https://github.com/zhangmeishan/NNTranSegmentor. |
| Open Datasets | Yes | We use three benchmark datasets for evaluation, namely CTB6, PKU and MSR. The CTB6 corpus is taken from the Penn Chinese Treebank 6.0, and the PKU and MSR corpora can be obtained from Bake Off2005 (Emerson, 2005). [...] The Chinese Gigaword corpus (LDC2011T13) is used to pretrain character and word embeddings. |
| Dataset Splits | Yes | We follow Zhang et al. (2014a), splitting the CTB6 corpus into training, development and testing sections. For the PKU and MSR corpora, only the training and test datasets are specified and we randomly split 10% of the training sections for development. [...] Table 3 shows the overall statistics of the four datasets. |
| Hardware Specification | No | The paper does not specify any particular hardware used for running experiments, such as specific CPU or GPU models. |
| Software Dependencies | No | The paper mentions using the 'word2vec tool' (Mikolov et al., 2013) and 'extended word2vec tool' (Levy and Goldberg, 2014) but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The hyper-parameter values are tuned according to preliminary results on the development corpus. We set the dimension size of the basic input character embeddings and word embeddings to 50. The dimension sizes of all the hidden layers of the neural model are set to 100. [...] The initial learning rate for Adagrad is set to 0.01, the regularization term in the training objective is set to 10^-8, and the value of η in max-margin training is set to 0.2. [...] We train different models on the corresponding training datasets for 20 iterations, and select the best iteration model according to their development performances. |
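The beam-search decoding quoted in the Pseudocode row above can be sketched as runnable Python. This is a minimal illustration, not the authors' implementation: the state is simplified to a tuple of segmented words plus a running score, and `score_action` is a hypothetical stand-in for the paper's neural scorer of SEP/APP transitions.

```python
def decode(chars, score_action, beam_size=16):
    """Segment `chars` into words via beam search over SEP/APP actions.

    `score_action(action, words_so_far, char)` is a placeholder for the
    neural scoring function; higher scores are better.
    """
    # Each agenda item: (words, score); words is a tuple of words built so far.
    agenda = [((), 0.0)]
    for c in chars:
        candidates = []
        for words, score in agenda:
            # SEP: start a new word with character c.
            candidates.append(
                (words + (c,), score + score_action("SEP", words, c)))
            # APP: append c to the last word (valid only if one exists).
            if words:
                candidates.append(
                    (words[:-1] + (words[-1] + c,),
                     score + score_action("APP", words, c)))
        # Top-B: keep the B highest-scoring items for the next step.
        agenda = sorted(candidates, key=lambda it: it[1],
                        reverse=True)[:beam_size]
    best_words, _ = max(agenda, key=lambda it: it[1])
    return list(best_words)
```

For example, a toy scorer that always prefers SEP yields one word per character, while one that prefers APP merges everything into a single word; the real model instead scores each action from word-level neural features.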