Joint Learning of Constituency and Dependency Grammars by Decomposed Cross-Lingual Induction
Authors: Wenbin Jiang, Qun Liu, Thepchai Supnithi
IJCAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We experiment on joint cross-lingual induction of constituency and dependency grammars from English to Chinese. We first verify the effectiveness of the transition-based variant model for constituency parsing. On WSJ treebank, this model achieves accuracy comparable to the classic transition-based model. The joint constituency and dependency grammar induced by the decomposed strategy achieves very significant improvement in both constituency and dependency grammar induction. Section 5: Experiments |
| Researcher Affiliation | Academia | (1) Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China; (2) ADAPT Centre, School of Computing, Dublin City University, Ireland; (3) National Electronics and Computer Technology Center, Thailand |
| Pseudocode | Yes | Algorithm 1: K-beam transition-based parsing. A generic beam-search sketch of this kind of procedure follows the table. |
| Open Source Code | No | No explicit statement or link providing access to the authors' own source code for the described methodology was found. The only URL provided is for a third-party maximum entropy toolkit. |
| Open Datasets | Yes | We first evaluate the performance of the remodeled transition-based parsing algorithm on the Wall Street Journal Treebank (WSJ) [Marcus et al., 1993]... We use FBIS Chinese-English dataset as the bilingual corpus for cross-lingual induction. The accuracy of the induced grammar is evaluated on some portions of the Penn Chinese Treebank (CTB) [Xue et al., 2005]. |
| Dataset Splits | Yes | Table 2 (data partitioning for WSJ and CTB, in units of sections): WSJ: training 02-21, developing 22, testing 23. CTB: training 1-270, 400-931, and 1001-1151; developing 301-325; testing 271-300. These splits are restated as a Python mapping after the table. |
| Hardware Specification | No | No specific hardware details (such as GPU or CPU models, memory, or cloud instance types) used for running the experiments were mentioned in the paper. |
| Software Dependencies | No | The paper mentions using 'the maximum entropy toolkit by Zhang' and 'GIZA++ [Och, 2003]' but does not provide specific version numbers for these software dependencies. |
| Experiment Setup | Yes | We set the gaussian prior as 1.0, the cutoff threshold as 0 (without cutoff), and the maximum training iteration as 100, while leaving other parameters as default values. For the k-beam transition-based parsing algorithm... less improvement can be obtained with k larger than 16. These settings are collected in the configuration sketch after the table. |
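
The paper's Algorithm 1 specifies k-beam transition-based parsing; the pseudocode itself is not reproduced here. As a point of reference only, below is a minimal generic beam-search decoder of the kind such parsers use. The state type and the `expand` and `is_final` callables are hypothetical placeholders, not the authors' code.

```python
import heapq
from typing import Callable, Iterable, Tuple, TypeVar

S = TypeVar("S")  # a parser state (e.g. stack, buffer, partial tree)

def k_beam_search(
    start: S,
    expand: Callable[[S], Iterable[Tuple[S, float]]],  # legal transitions with their scores
    is_final: Callable[[S], bool],
    k: int = 16,
) -> Tuple[S, float]:
    """Keep the k best (state, score) pairs at each step; return the best final one."""
    beam = [(start, 0.0)]
    while any(not is_final(state) for state, _ in beam):
        candidates = []
        for state, score in beam:
            if is_final(state):
                # completed analyses compete with partial ones unchanged
                candidates.append((state, score))
            else:
                for successor, delta in expand(state):
                    candidates.append((successor, score + delta))
        # prune back to the k highest-scoring candidates
        beam = heapq.nlargest(k, candidates, key=lambda pair: pair[1])
    return max(beam, key=lambda pair: pair[1])
```

With k = 1 this degenerates to greedy transition-based parsing; the paper reports little further improvement for k above 16.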
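
The section splits quoted from Table 2 can also be written down directly. A plain Python mapping, with the ranges kept as the strings reported in the paper:

```python
# Treebank partitioning reported in Table 2 (values are inclusive section ranges).
SPLITS = {
    "WSJ": {"training": ["02-21"], "developing": ["22"], "testing": ["23"]},
    "CTB": {
        "training": ["1-270", "400-931", "1001-1151"],
        "developing": ["301-325"],
        "testing": ["271-300"],
    },
}
```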
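
Finally, the reported training settings gathered in one place. This is an illustrative configuration dictionary; the key names are ours, not options of Zhang's maximum entropy toolkit.

```python
# Hyperparameters quoted in the paper; names are illustrative, not toolkit flags.
EXPERIMENT_CONFIG = {
    "gaussian_prior": 1.0,   # variance of the Gaussian (L2) prior on feature weights
    "feature_cutoff": 0,     # feature-count threshold; 0 means no cutoff
    "max_iterations": 100,   # maximum maxent training iterations
    "beam_size": 16,         # k for k-beam parsing; little gain reported beyond 16
}
```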