A Representation Learning Framework for Multi-Source Transfer Parsing
Authors: Jiang Guo, Wanxiang Che, David Yarowsky, Haifeng Wang, Ting Liu
AAAI 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | By evaluating on the Google universal dependency treebanks (v2.0), our best models yield an absolute improvement of 6.53% in averaged labeled attachment score, as compared with delexicalized multi-source transfer models. We also significantly outperform the state-of-the-art transfer system proposed most recently. |
| Researcher Affiliation | Collaboration | (1) Center for Social Computing and Information Retrieval, Harbin Institute of Technology, Harbin, China; (2) Center for Language and Speech Processing, Johns Hopkins University, Baltimore, USA; (3) Baidu Inc., Beijing, China |
| Pseudocode | No | The paper describes algorithms using text and mathematical equations but does not include structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper mentions using existing tools (cdec, word2vec) and adopting an implementation from a previous paper (Guo et al. 2015), but it does not provide a link or explicit statement about the source code for the novel framework described in this paper. |
| Open Datasets | Yes | We use the Google universal treebanks (v2.0) (McDonald et al. 2013) for evaluation. The languages we consider include all Indo-European languages presented in the universal treebanks. For both MULTI-SG and MULTI-PROJ, we use the Europarl corpus for EN-{DE, ES, FR, PT, IT, SV} parallel data, and the WMT-2011 English news corpora as additional monolingual data. |
| Dataset Splits | No | The paper mentions training models and evaluating on test data, but it does not explicitly describe the use of a separate validation set for hyperparameter tuning or early stopping, nor does it provide specific split percentages for such a set. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using 'cdec', 'word2vec', and a 'multi-threaded Brown clustering tool' but does not specify their version numbers or other software dependencies with versions. |
| Experiment Setup | Yes | We use the cross-entropy loss as the objective function, and use mini-batch AdaGrad to train the parser. In practice, when we apply our model to a low-resource language, we typically don't have any development data for parameter tuning, so we simply train our parsing models for 20,000 iterations without early stopping. (A minimal training-loop sketch of this regime follows the table.) |
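
The quoted setup (cross-entropy loss, mini-batch AdaGrad, a fixed budget of 20,000 iterations with no early stopping) can be illustrated with a minimal sketch. The toy softmax classifier below is only a stand-in for the paper's neural parser; the data, dimensions, learning rate, and batch size are illustrative assumptions, not values from the paper.

```python
# Hedged sketch of the reported training regime: cross-entropy loss,
# mini-batch AdaGrad, fixed 20,000 iterations, no early stopping.
# The softmax classifier is a toy stand-in for the parser's scoring model.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: feature vectors and labels (stand-ins for parser states/actions).
num_examples, input_dim, num_classes = 1000, 50, 3
X = rng.normal(size=(num_examples, input_dim))
y = rng.integers(0, num_classes, size=num_examples)

W = np.zeros((input_dim, num_classes))   # model parameters
grad_sq_sum = np.zeros_like(W)           # AdaGrad accumulator
lr, eps, batch_size, num_iters = 0.05, 1e-8, 32, 20000  # lr/eps/batch are assumptions

for it in range(num_iters):              # fixed iteration budget, no early stopping
    idx = rng.integers(0, num_examples, size=batch_size)
    xb, yb = X[idx], y[idx]

    # Forward pass: softmax probabilities.
    logits = xb @ W
    logits -= logits.max(axis=1, keepdims=True)
    probs = np.exp(logits)
    probs /= probs.sum(axis=1, keepdims=True)

    # Gradient of the mini-batch cross-entropy loss w.r.t. W.
    probs[np.arange(batch_size), yb] -= 1.0
    grad = xb.T @ probs / batch_size

    # AdaGrad update: per-parameter scaling by accumulated squared gradients.
    grad_sq_sum += grad ** 2
    W -= lr * grad / (np.sqrt(grad_sq_sum) + eps)
```

Because no development set is assumed, the loop runs for the full iteration count and the final parameters are used as-is, matching the paper's stated decision to forgo early stopping.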