Conditional Random Field Autoencoders for Unsupervised Structured Prediction

Authors: Waleed Ammar, Chris Dyer, Noah A. Smith

NeurIPS 2014 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We then show competitive results with instantiations of the framework for two canonical tasks in natural language processing: part-of-speech induction and bitext word alignment, and show that training the proposed model can be substantially more efficient than a comparable feature-rich baseline. We evaluate the effectiveness of CRF autoencoders for learning from unlabeled examples in POS induction and word alignment. Fig. 3 compares predictions of the CRF autoencoder model in seven languages to those of a featurized first-order HMM model [3] and a standard (feature-less) first-order HMM, using V-measure [37]. AER for variants of each model (forward, reverse, and symmetrized) are shown in Table 1 (left). (An illustrative sketch of the V-measure and AER metrics follows the table.)
Researcher Affiliation | Academia | Waleed Ammar, Chris Dyer, Noah A. Smith; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; {wammar,cdyer,nasmith}@cs.cmu.edu
Pseudocode | No | The paper describes the model and learning process using textual descriptions and mathematical equations, but does not include structured pseudocode or algorithm blocks. (An illustrative sketch of the core computation, not taken from the paper, follows the table.)
Open Source Code | No | The paper mentions 'cdec [13]' as a machine translation system and provides its URL, but this is a third-party tool used by the authors, not the source code for their own method. There is no explicit statement or link providing access to the authors' implementation.
Open Datasets | Yes | We evaluate the effectiveness of CRF autoencoders for learning from unlabeled examples in POS induction and word alignment. We found reconstructing Brown clusters [5] of tokens instead of their surface forms to improve POS induction. We consider an intrinsic evaluation on a Czech-English dataset of manual alignments. For POS induction, the paper evaluates on seven languages and compares against a featurized first-order HMM model [3], implying the use of established benchmark datasets referenced in the cited work.
Dataset Splits | No | The paper does not provide specific dataset split information (e.g., percentages, sample counts, or explicit mention of a validation set) needed to reproduce the data partitioning.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory amounts, or detailed machine specifications) used for running its experiments.
Software Dependencies | No | The paper mentions 'AdaGrad [12]' and 'L-BFGS' as optimizers and 'cdec [13]' as a decoder, but does not provide specific version numbers for these or any other software dependencies.
Experiment Setup | Yes | In the experiments below, we apply a squared L2 regularizer for the CRF parameters λ and a symmetric Dirichlet prior for categorical parameters θ. We experimented with AdaGrad [12] and L-BFGS. In POS induction, |Y| is a constant, the number of syntactic classes, which we configure to 12 in our experiments. We defer the detailed experimental setup to Appendix A. (A placeholder configuration sketch follows the table.)
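
The Research Type row cites V-measure [37] for POS induction and AER for word alignment. As a point of reference only, here is a minimal sketch of how those metrics are typically computed; it is not taken from the paper, and it assumes scikit-learn's v_measure_score plus the standard definition of AER from Och and Ney (2003). The function names and toy inputs are hypothetical.

```python
# Illustrative only; not the authors' evaluation code.
from sklearn.metrics import v_measure_score

def pos_induction_v_measure(gold_tags, induced_tags):
    """V-measure between gold POS tags and induced cluster ids,
    both flattened to one label per token."""
    return v_measure_score(gold_tags, induced_tags)

def alignment_error_rate(predicted, sure, possible):
    """Standard AER (Och and Ney, 2003); lower is better.
    Each argument is a set of (source_index, target_index) links,
    with the sure links a subset of the possible links."""
    a, s, p = set(predicted), set(sure), set(possible)
    return 1.0 - (len(a & s) + len(a & p)) / (len(a) + len(s))

# Toy usage with made-up labels and alignments:
print(pos_induction_v_measure(["DET", "NOUN", "VERB"], [3, 7, 7]))
print(alignment_error_rate({(0, 0), (1, 2)}, {(0, 0)}, {(0, 0), (1, 2)}))
```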
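Since the Pseudocode row notes that the paper gives no algorithm blocks, the following is a hedged sketch of the core computation under the CRF autoencoder factorization the paper describes, p(x̂, y | x) = p(y | x; λ) · ∏ᵢ p(x̂ᵢ | yᵢ; θ), using a first-order (linear-chain) CRF and a log-space forward pass to marginalize over label sequences. The NumPy implementation and function names are illustrative assumptions, not the authors' code.

```python
# Illustrative sketch of the CRF autoencoder marginal likelihood for one
# sentence; not the authors' implementation.
import numpy as np

def logsumexp(v):
    m = np.max(v)
    return m + np.log(np.sum(np.exp(v - m)))

def logsumexp_over_rows(m):
    # log-sum-exp over axis 0 of a (K, K) matrix, one value per column
    mx = np.max(m, axis=0)
    return mx + np.log(np.sum(np.exp(m - mx), axis=0))

def log_marginal_likelihood(emission_scores, transition_scores, recon_logprobs):
    """log sum_y [ p_CRF(y | x; lambda) * prod_i p(x_hat_i | y_i; theta) ].

    emission_scores  : (T, K) per-position CRF scores (lambda . features)
    transition_scores: (K, K) CRF scores for label bigrams
    recon_logprobs   : (T, K) log p(x_hat_i | y_i = k; theta)
    """
    T, K = emission_scores.shape

    def forward(extra_logpotentials):
        alpha = emission_scores[0] + extra_logpotentials[0]
        for t in range(1, T):
            alpha = (emission_scores[t] + extra_logpotentials[t]
                     + logsumexp_over_rows(alpha[:, None] + transition_scores))
        return logsumexp(alpha)

    log_numerator = forward(recon_logprobs)      # clamped to the reconstruction
    log_partition = forward(np.zeros((T, K)))    # CRF normalizer Z(x)
    return log_numerator - log_partition

# Toy usage: 5 tokens, 12 latent classes (the paper's POS setting uses 12).
rng = np.random.default_rng(0)
T, K = 5, 12
ll = log_marginal_likelihood(rng.normal(size=(T, K)),
                             rng.normal(size=(K, K)),
                             np.log(rng.uniform(0.01, 1.0, size=(T, K))))
print(ll)  # training maximizes this quantity (plus regularization terms)
```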
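Finally, the Experiment Setup row quotes only high-level choices (squared L2 regularizer on λ, symmetric Dirichlet prior on θ, AdaGrad or L-BFGS, 12 classes for POS induction) and defers details to Appendix A. A placeholder configuration capturing just those stated choices might look like the following; every numeric value is an assumption, not a reported setting.

```python
# Placeholder configuration reflecting only what the excerpt states; all
# numeric values are assumptions, since the paper defers them to Appendix A.
pos_induction_config = {
    "num_classes": 12,                 # |Y|, stated in the paper
    "crf_regularizer": "squared_l2",   # on CRF parameters lambda
    "crf_l2_strength": 1.0,            # placeholder, not reported in the excerpt
    "theta_prior": "symmetric_dirichlet",
    "dirichlet_concentration": 0.1,    # placeholder
    "optimizer": "adagrad",            # paper experimented with AdaGrad and L-BFGS
    "learning_rate": 0.1,              # placeholder
    "reconstruct": "brown_clusters",   # tokens reconstructed as Brown clusters [5]
}
```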