Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Exploring Segment Representations for Neural Segmentation Models
Authors: Yijia Liu, Wanxiang Che, Jiang Guo, Bing Qin, Ting Liu
IJCAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments on two typical segmentation tasks: named entity recognition (NER) and Chinese word segmentation (CWS). Experimental results show that our neural semi-CRF model benefits from representing the entire segment and achieves the state-of-the-art performance on CWS benchmark dataset and competitive results on the CoNLL03 dataset. |
| Researcher Affiliation | Academia | Yijia Liu, Wanxiang Che, Jiang Guo, Bing Qin, Ting Liu. Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, China. EMAIL |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We release our code at https://github.com/ExpResults/segrep-for-nn-semicrf. |
| Open Datasets | Yes | For NER, we use the CoNLL03 dataset which is widely adopted for evaluating NER models performance. For CWS, we follow previous study and use three Simplified Chinese datasets: PKU and MSR from 2nd SIGHAN bakeoff and Chinese Treebank 6.0 (CTB6). |
| Dataset Splits | Yes | For the PKU and MSR datasets, last 10% of the training data are used as development data as [Pei et al., 2014] does. For CTB6 data, recommended data split is used. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would be needed for replication. |
| Experiment Setup | Yes | Throughout this paper, we use the same hyper-parameters for different experiments as listed in Table 1. Initial learning rate is set as $\eta_0 = 0.1$ and updated as $\eta_t = \eta_0 / (1 + 0.1t)$ on each epoch $t$. Best training iteration is determined by the evaluation score on development data. |
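
The learning-rate schedule and the PKU/MSR development split quoted in the table can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' released code: the function names `learning_rate` and `split_train_dev` are hypothetical, epochs are assumed to be counted from 0, and the "last 10%" is assumed to be taken over a list of training examples (for instance sentences), rounded down.

```python
def learning_rate(epoch, initial_rate=0.1, decay=0.1):
    """Return the rate for a given epoch: initial_rate / (1 + decay * epoch).

    The defaults mirror the quoted setup (initial rate 0.1, decay factor 0.1).
    Counting epochs from 0 is an assumption; the report does not say.
    """
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial_rate / (1.0 + decay * epoch)


def split_train_dev(examples, dev_fraction=0.1):
    """Hold out the trailing dev_fraction of examples as development data.

    Rounding the development size down is an assumption made for this sketch.
    """
    if not 0.0 < dev_fraction < 1.0:
        raise ValueError("dev_fraction must be strictly between 0 and 1")
    dev_size = int(len(examples) * dev_fraction)
    cut = len(examples) - dev_size
    return examples[:cut], examples[cut:]


if __name__ == "__main__":
    # Print the decayed rate for the first few epochs.
    for epoch in range(5):
        print(epoch, learning_rate(epoch))

    # Toy stand-in for a training corpus: 10 placeholder examples.
    train, dev = split_train_dev(list(range(10)))
    print(len(train), len(dev))
```

Selecting the best training iteration would then amount to evaluating on the held-out `dev` portion after each epoch and keeping the highest-scoring one, as the quoted setup describes.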