Exploring Segment Representations for Neural Segmentation Models

Authors: Yijia Liu, Wanxiang Che, Jiang Guo, Bing Qin, Ting Liu

IJCAI 2016

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We conduct extensive experiments on two typical segmentation tasks: named entity recognition (NER) and Chinese word segmentation (CWS). Experimental results show that our neural semi-CRF model benefits from representing the entire segment and achieves the state-of-the-art performance on the CWS benchmark dataset and competitive results on the CoNLL03 dataset."
Researcher Affiliation | Academia | "Yijia Liu, Wanxiang Che, Jiang Guo, Bing Qin, Ting Liu. Research Center for Social Computing and Information Retrieval, Harbin Institute of Technology, China. {yjliu,car,jguo,qinb,tliu}@ir.hit.edu.cn"
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We release our code at https://github.com/ExpResults/segrep-for-nn-semicrf."
Open Datasets | Yes | "For NER, we use the CoNLL03 dataset, which is widely adopted for evaluating NER models' performance. For CWS, we follow previous studies and use three Simplified Chinese datasets: PKU and MSR from the 2nd SIGHAN bakeoff and Chinese Treebank 6.0 (CTB6)."
Dataset Splits | Yes | "For the PKU and MSR datasets, the last 10% of the training data is used as development data, as [Pei et al., 2014] does. For the CTB6 data, the recommended data split is used."
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments.
Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x) that would be needed for replication.
Experiment Setup | Yes | "Throughout this paper, we use the same hyper-parameters for different experiments as listed in Table 1. Initial learning rate is set as η_0 = 0.1 and updated as η_t = η_0 / (1 + 0.1t) on each epoch t. Best training iteration is determined by the evaluation score on development data."
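The learning-rate decay quoted in the Experiment Setup row (initial rate 0.1, decayed as η_t = η_0 / (1 + 0.1t) per epoch t) can be sketched in a few lines of Python. This is a minimal illustration of the schedule only; the function name is hypothetical and not taken from the released code.

```python
def learning_rate(epoch, eta0=0.1):
    """Per-epoch learning-rate schedule quoted from the paper:
    eta_t = eta_0 / (1 + 0.1 * t), with eta_0 = 0.1.

    `learning_rate` is an illustrative name, not from the authors' code.
    """
    return eta0 / (1.0 + 0.1 * epoch)

# The rate starts at 0.1 and halves by epoch 10, since 1 + 0.1 * 10 = 2.
print(learning_rate(0))   # 0.1
print(learning_rate(10))  # 0.05
```

Under this schedule the decay is hyperbolic rather than exponential, so the rate shrinks quickly in early epochs and flattens out later.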