reproducibilityindex.ai

Generalised Brown Clustering and Roll-Up Feature Generation

Authors: Leon Derczynski, Sean Chester

AAAI 2016

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Table 2 presents extrinsic results for decoupling a and |C|. We measure F1 at the CoNLL 03 task’s test-B set, using a linear-chain CRF and shearing at depths 4, 6, 10 and 20 as the only features, evaluating with CRFsuite at token level.
Researcher Affiliation	Academia	Leon Derczynski, University of Sheffield, 211 Portobello, Sheffield S1 4DP, United Kingdom, leon@dcs.shef.ac.uk; Sean Chester, NTNU, Sem Saelandsvei 9, 7491 Trondheim, Norway, and Aarhus Universitet, Abogade 34, 8200 Aarhus N, Denmark, sean.chester@idi.ntnu.no
Pseudocode	Yes	Algorithm 1: Brown clustering as proposed by Brown et al. Algorithm 2: Generalised Brown clustering.
Open Source Code	Yes	Software for Generalised Brown clustering and roll-up feature generation is available freely at http://dx.doi.org/10.5281/zenodo.33758 (Chester and Derczynski 2015).
Open Datasets	Yes	We measure F1 at the CoNLL 03 task’s test-B set, using a linear-chain CRF... using a computationally feasible subset of the Brown corpus (Francis and Kucera 1979) with 12k tokens and 3.7k word types... Using the RCV1 corpus cleaned as per Liang (2005)
Dataset Splits	No	The paper mentions test data but does not explicitly provide details about training/validation splits (percentages, sample counts, or cross-validation setup).
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies	No	The paper mentions software like "CRFsuite" and "conlleval.pl" but does not specify version numbers for these dependencies.
Experiment Setup	Yes	Table 2 presents extrinsic results for decoupling a and |C|. We measure F1 at the CoNLL 03 task’s test-B set, using a linear-chain CRF and shearing at depths 4, 6, 10 and 20 as the only features, evaluating with CRFsuite at token level... using CRFsuite with stochastic gradient descent, and evaluating with conlleval.pl at chunk level... with a = 2560 as per Derczynski, Chester, and Bøgh (2015), we shear the tree for each bitdepth in l = {4, 6, 10, 20} as per Ratinov and Roth (2009) and others in later literature.
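
The shearing step quoted in the Experiment Setup row can be illustrated with a minimal Python sketch. This is not the authors' released software. It assumes each word's cluster is available as a bit-string path from the root of the cluster tree, and that shearing at depth d means keeping the first d bits of that path. The names shear_features, token_features and example_paths, the "brown" feature prefix, and the example paths themselves are hypothetical and invented for illustration.

```python
from typing import Dict, Iterable, List

# Bit depths quoted in the Experiment Setup row.
SHEAR_DEPTHS = (4, 6, 10, 20)


def shear_features(bit_path: str, depths: Iterable[int] = SHEAR_DEPTHS) -> List[str]:
    """Return one prefix feature per depth for a single cluster path.

    Assumption: a path shorter than a given depth is used whole, so a
    shallow leaf yields the same prefix at every deeper cut.
    """
    return [f"brown{d}={bit_path[:d]}" for d in depths]


def token_features(tokens: List[str], paths: Dict[str, str]) -> List[List[str]]:
    """Map each token to its sheared prefix features; unknown words get none."""
    return [shear_features(paths[t]) if t in paths else [] for t in tokens]


# Hypothetical word-to-path table, invented for illustration only.
example_paths = {
    "london": "110100101101",
    "paris": "110100101110",
    "runs": "0111",
}

features = token_features(["london", "runs", "zebra"], example_paths)
```

In this sketch, "london" and "paris" share their prefixes at depths 4, 6 and 10 and differ only at depth 20, which is how shallow cuts give coarse classes and deep cuts give fine ones. How the resulting strings are written into a CRFsuite training file is left out here, since the table does not describe it.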