Learning Distributed Representations for Structured Output Prediction
Authors: Vivek Srikumar, Christopher D. Manning
NIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach on two tasks which have semantically rich labels: multiclass classification on the newsgroup data and part-of-speech tagging for English and Basque. In all cases, we show that DISTRO outperforms the structural SVM baselines. We demonstrate the effectiveness of DISTRO on two tasks: document classification (purely atomic structures) and part-of-speech (POS) tagging (both atomic and compositional structures). In both cases, we compare to structural SVMs, i.e., the case of one-hot label vectors, as the baseline. Table 1 reports the performance of the baseline and variants of DISTRO for newsgroup classification; Table 2 presents the results for the two languages. |
| Researcher Affiliation | Academia | Vivek Srikumar University of Utah svivek@cs.utah.edu Christopher D. Manning Stanford University manning@cs.stanford.edu |
| Pseudocode | Yes | Algorithm 1: Learning algorithm by alternating minimization. The goal is to solve min_{w,A} f(w, A). The input to the problem is a training set of examples consisting of pairs of labeled inputs (x_i, y_i) and T, the number of iterations. |
| Open Source Code | No | The paper does not provide an explicit statement about the release of source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We used the bydate version of the data with tokens as features. Table 1 reports the performance of the baseline and variants of DISTRO for newsgroup classification. The 20 Newsgroups Dataset [13]. English POS tagging has been long studied using the Penn Treebank data [15]. We used the Basque data from the CoNLL 2007 shared task [17] for training the Basque POS tagger. |
| Dataset Splits | Yes | We selected the hyper-parameters for all experiments by cross validation. We used the standard train-test split [8, 24]: we trained on sections 0-18 of the Treebank and report performance on sections 22-24. |
| Hardware Specification | No | The paper does not explicitly describe the specific hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions using the 'Stanford NLP pipeline' but does not specify its version or any other software dependencies with version numbers. |
| Experiment Setup | Yes | We selected the hyper-parameters for all experiments by cross validation. We ran the alternating algorithm for 5 epochs for all cases with 5 epochs of SGD for both the weight and label vectors. We allowed the baseline to run for 25 epochs over the data. |
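The pseudocode and setup rows above describe an alternating-minimization scheme: fix the label-embedding matrix A and run SGD on the weights w, then fix w and run SGD on A, repeating for a small number of outer epochs (5 outer epochs with 5 SGD epochs per inner step, per the paper's setup). The sketch below illustrates that control flow only; the squared-error objective, the `train`/`predict` names, and the toy nearest-label-vector prediction rule are illustrative assumptions, not the paper's structured-SVM loss or DISTRO itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def train(X, y, n_labels, dim=4, outer_epochs=5, sgd_epochs=5, lr=0.05):
    """Alternating minimization in the spirit of Algorithm 1.

    Illustrative objective (an assumption, simpler than the paper's):
        f(W, A) = sum_i ||x_i W - a_{y_i}||^2
    where W maps features into the label-embedding space and row
    a_y of A is the distributed vector for label y.
    """
    n, d = X.shape
    W = rng.normal(scale=0.1, size=(d, dim))        # feature weights
    A = rng.normal(scale=1.0, size=(n_labels, dim))  # one vector per label
    for _ in range(outer_epochs):
        # Step 1: fix A, update W by SGD on f(W, A).
        for _ in range(sgd_epochs):
            for i in rng.permutation(n):
                grad = 2.0 * np.outer(X[i], X[i] @ W - A[y[i]])
                W -= lr * grad
        # Step 2: fix W, update the label vectors A by SGD.
        for _ in range(sgd_epochs):
            for i in rng.permutation(n):
                A[y[i]] -= lr * 2.0 * (A[y[i]] - X[i] @ W)
    return W, A

def predict(X, W, A):
    # Assign each input to the label whose vector is nearest in embedding space.
    proj = X @ W
    d2 = ((proj[:, None, :] - A[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)
```

One design point the row hints at: the baseline structural SVM was allowed 25 epochs, matching the 5 × 5 total passes of the alternating scheme, so the comparison holds the optimization budget roughly constant.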