Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Neural Character-Level Syntactic Parsing for Chinese
Authors: Zuchao Li, Junru Zhou, Hai Zhao, Zhisong Zhang, Haonan Li, Yuqi Ju
JAIR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Finally, we evaluate our models on the Chinese Penn Treebank (CTB) and our published Shanghai Jiao Tong University Chinese Character Dependency Treebank (SCDT). The results show the effectiveness of our model on both constituent and dependency parsing. We further provide empirical analysis and suggest several directions for future study. |
| Researcher Affiliation | Academia | Zuchao Li EMAIL Junru Zhou EMAIL Hai Zhao EMAIL Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China Zhisong Zhang EMAIL Language Technologies Institute, Carnegie Mellon University, Pittsburgh, USA Haonan Li EMAIL School of Computing and Information Systems, the University of Melbourne, Melbourne, Australia Yuqi Ju EMAIL Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, China |
| Pseudocode | Yes | Algorithm 1 Constituent-Construction(u) u is a node of character-level dependency tree. if u is a leaf of the character-level dependency tree then constructing constituent leaf node leaf-node using u return leaf-node end if children-list = [] for each child v on the left of u do left-node = Constituent-Construction(v) putting left-node into children-list end for constructing constituent leaf node leaf-node using u putting leaf-node into children-list for each child v on the right of u do right-node = Constituent-Construction(v) putting right-node into children-list end for constructing constituent node constituent-node using children-list return constituent-node |
| Open Source Code | Yes | The code is available at https://github.com/bcmi220/ccharpar. |
| Open Datasets | Yes | Finally, we evaluate our models on the Chinese Penn Treebank (CTB) and our published Shanghai Jiao Tong University Chinese Character Dependency Treebank (SCDT). We manually annotated such character-level dependencies for all the words in the Chinese Penn Treebank (CTB-7.0). We name the resulting corpus as SJTU Chinese Character Dependency Treebank (SCDT) (http://bcmi.sjtu.edu.cn/~zebraform/). For Chinese, character-level constituent and dependency structures have been previously explored by Zhang et al. (2013) and Zhang et al. (2014), respectively, whose annotations have also been made publicly available (https://github.com/zhangmeishan/ACL2013-CharParsing, https://github.com/zhangmeishan/ACL2014CharDep). |
| Dataset Splits | Yes | We use Chinese Penn Treebank 5.1 (CTB5) for both constituent and dependency evaluation with articles 001-270 and 440-1151 for training, articles 301-325 as the development set and articles 271-300 for the test set in constituent parsing evaluation following the standard split seen in Liu and Zhang (2017b); with articles 001-815 and 1001-1136 for training, articles 886-931, 1148-1151 as the development set, and articles 816-885 and 1137-1147 for the test set in dependency parsing evaluation following the standard split seen in Zhang and Clark (2008). |
| Hardware Specification | Yes | All parse models are trained for up to 150 epochs on a single NVIDIA GeForce GTX TITAN X GPU with Intel i7-7800X CPU. |
| Software Dependencies | No | The paper mentions the "Adam optimizer" and "open source Fairseq" but does not provide version numbers for these or other key software components used in the implementation. It links to pre-trained models, but these are model artifacts, not a versioned software environment. |
| Experiment Setup | Yes | In our experiments, the dimension size of character embeddings is 100; we use pre-trained structured-skipgram (Ling et al., 2015) embeddings to initialize our character embeddings. For the self-attention encoder, we apply 4 layers for the POS self-attention encoder and 8 layers for the syntactic self-attention encoder, keeping other hyperparameter settings the same as Kitaev and Klein (2018). For span scores, we apply feed-forward networks with a hidden layer size of 250. For the dependency biaffine scorer, we employ two 1024-dimensional MLP layers with ReLU as the activation function and a 1024-dimensional parameter matrix for biaffine attention. Training Details: We use dropout of 0.33 for biaffine attention and MLP layers. The Adam optimizer with initial learning rate 5e-3 and 160 warmup steps is employed for optimization. All parse models are trained for up to 150 epochs on a single NVIDIA GeForce GTX TITAN X GPU with an Intel i7-7800X CPU. In the text classification experiment, the initial learning rate is set to 2e-5, and a maximum of 10 epochs are trained. |
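The Algorithm 1 excerpt quoted in the Pseudocode row (an in-order traversal that rebuilds a constituent tree from a character-level dependency tree) can be sketched as runnable Python. The `DepNode` class and the tuple representation of constituents are illustrative assumptions, not the paper's data structures:

```python
# Hypothetical sketch of Algorithm 1 (Constituent-Construction): recursively
# visit left dependents, the head character, then right dependents, so the
# resulting constituent preserves the original character order.

class DepNode:
    """A node in a character-level dependency tree (illustrative)."""
    def __init__(self, char, left=None, right=None):
        self.char = char
        self.left = left or []    # dependents attached on the left of the head
        self.right = right or []  # dependents attached on the right of the head

def constituent_construction(u):
    """Build the constituent subtree rooted at dependency node u."""
    if not u.left and not u.right:
        return u.char                         # leaf: a constituent leaf node
    children = []
    for v in u.left:                          # left dependents, in order
        children.append(constituent_construction(v))
    children.append(u.char)                   # the head character itself
    for v in u.right:                         # right dependents, in order
        children.append(constituent_construction(v))
    return tuple(children)                    # internal constituent node

# Example: head "吃" (eat) with left dependent "我" (I) and right dependent "饭" (rice)
tree = DepNode("吃", left=[DepNode("我")], right=[DepNode("饭")])
print(constituent_construction(tree))  # ('我', '吃', '饭')
```

Because the traversal interleaves the head between its left and right dependents, flattening the output tuple always recovers the original character sequence.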
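The Experiment Setup row describes a biaffine dependency scorer: two role-specific MLPs over encoder outputs, followed by a biaffine parameter matrix. A minimal NumPy sketch of that scoring form, assuming single-layer MLPs and an arbitrary encoder dimension (the paper specifies only the 1024-dimensional MLP and biaffine sizes):

```python
import numpy as np

# Illustrative biaffine scoring (not the authors' code):
#   s[i, j] = head_i^T U dep_j + b^T dep_j
# where head/dep are ReLU MLP projections of the encoder output.
rng = np.random.default_rng(0)
n, enc_dim, mlp_dim = 5, 64, 1024   # mlp_dim matches the paper; enc_dim is assumed

def relu(x):
    return np.maximum(x, 0.0)

H = rng.standard_normal((n, enc_dim))             # one encoder vector per character
W_head = rng.standard_normal((enc_dim, mlp_dim))  # head-role MLP weights
W_dep = rng.standard_normal((enc_dim, mlp_dim))   # dependent-role MLP weights
heads = relu(H @ W_head)                          # head representations
deps = relu(H @ W_dep)                            # dependent representations
U = rng.standard_normal((mlp_dim, mlp_dim))       # biaffine parameter matrix
b = rng.standard_normal(mlp_dim)                  # bias over dependent features

scores = heads @ U @ deps.T + (deps @ b)[None, :] # scores[i, j]: i heads j
print(scores.shape)  # (5, 5)
```

Each row of `scores` can then be softmax-normalized to give, for each character, a distribution over candidate heads.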