An Active Learning Approach to Coreference Resolution

Authors: Mrinmaya Sachan, Eduard Hovy, Eric P. Xing

IJCAI 2015

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We evaluate this hypothesis and our algorithms on both entity and event coreference tasks and on two languages. |
| Researcher Affiliation | Academia | Mrinmaya Sachan, Eduard Hovy, Eric P. Xing School of Computer Science Carnegie Mellon University {mrinmays, hovy, epxing}@cs.cmu.edu |
| Pseudocode | Yes | The problem (maximize J ) can be solved using a variation of the familiar hard-EM solution described in Algorithm 1 for k-medoids where we also update the metric in the M-step. (...) Algorithm 1 Cross-Document Coref Solver(M, ML, CL) |
| Open Source Code | No | The paper does not provide an explicit statement about the release of its source code for the described methodology, nor does it provide a direct link to a code repository. |
| Open Datasets | Yes | First is the standard cross-document coreference evaluation dataset in English, ACE-2008 [Strassel et al., 2008]. (...) Next, we will also use the (in-document) IC Event Coreference Corpus and setup [Liu et al., 2014] for evaluations on event coreference and a small newswire dataset on Hindi annotated for in-document entity coreference by us. |
| Dataset Splits | Yes | To create the Hindi entity coreference dataset, we annotated 91 news-articles (50 train, 15 dev and 26 test) published on October 10, 2004 in two popular Hindi newspapers Amar Ujala and Navbharat Times using Brat [Stenetorp et al., 2012]. (...) Finally, we also created a small annotated dataset of 100 blogs (50 train, 25 dev, 25 test) were annotated for in-doc coreference resolution. (...) The hyper parameters λ, λ , wml and wcl and the threshold for singleton removal are optimized on the development sets using line search. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions using "Brat [Stenetorp et al., 2012]" for annotation but does not provide specific version numbers for any software dependencies required to replicate the experiments. |
| Experiment Setup | Yes | The hyper parameters λ, λ , wml and wcl and the threshold for singleton removal are optimized on the development sets using line search. (...) For entity coreference in English, we use the pairwise features employed in the Berkeley Coreference System [Durrett et al., 2013]. (...) For event coreference resolution, we use the features in [Liu et al., 2014]. (...) For entity coreference in Hindi, we (quickly) build a small set of local, pairwise similarity features ourselves. |
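
The Pseudocode row refers to a hard-EM style k-medoids solver that takes mentions M together with must-link (ML) and cannot-link (CL) constraints. Since no code was released, the following is only a minimal sketch of that general idea, not the authors' Algorithm 1. Everything in it is an assumption made for illustration: the function name `constrained_k_medoids`, the fixed number of clusters `k`, the additive penalties `w_ml` and `w_cl`, and the precomputed distance matrix `dist`. The metric update that the paper performs in the M-step is deliberately left out, so the distance stays fixed here.

```python
from typing import List, Sequence, Tuple

Pair = Tuple[int, int]


def constrained_k_medoids(
    dist: Sequence[Sequence[float]],
    k: int,
    must_link: Sequence[Pair] = (),
    cannot_link: Sequence[Pair] = (),
    w_ml: float = 1.0,   # hypothetical penalty for splitting a must-link pair
    w_cl: float = 1.0,   # hypothetical penalty for merging a cannot-link pair
    max_iters: int = 50,
) -> Tuple[List[int], List[int]]:
    """Hard-EM style k-medoids with soft pairwise constraints (illustrative)."""
    n = len(dist)
    medoids = list(range(k))  # deterministic init: first k points (assumption)
    assign = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]

    def cost(i: int, c: int) -> float:
        # Distance to the cluster medoid plus penalties for violated constraints.
        penalty = 0.0
        for a, b in must_link:
            if i in (a, b):
                other = b if a == i else a
                if assign[other] != c:
                    penalty += w_ml
        for a, b in cannot_link:
            if i in (a, b):
                other = b if a == i else a
                if assign[other] == c:
                    penalty += w_cl
        return dist[i][medoids[c]] + penalty

    for _ in range(max_iters):
        changed = False
        # E-step: greedily move each point to its cheapest cluster.
        for i in range(n):
            best = min(range(k), key=lambda c: cost(i, c))
            if best != assign[i]:
                assign[i] = best
                changed = True
        # M-step: pick the member with the smallest total distance as medoid.
        # (The paper also updates the metric here; this sketch does not.)
        for c in range(k):
            members = [i for i in range(n) if assign[i] == c]
            if members:
                new_medoid = min(
                    members, key=lambda m: sum(dist[m][j] for j in members)
                )
                if new_medoid != medoids[c]:
                    medoids[c] = new_medoid
                    changed = True
        if not changed:
            break
    return assign, medoids
```

The constraints are treated as soft penalties rather than hard requirements, which is one common design choice for semi-supervised clustering; whether the paper's objective J has this exact form cannot be determined from the summary above.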
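
The Dataset Splits and Experiment Setup rows state that the hyperparameters and the singleton-removal threshold are tuned on the development sets using line search, without giving further detail. One plausible reading is a coordinate-wise search that varies one hyperparameter at a time while holding the others fixed. The sketch below illustrates that reading only; the function `coordinate_line_search`, the candidate grids, and the scoring callback are hypothetical and do not come from the paper.

```python
from typing import Callable, Dict, Sequence, Tuple

Params = Dict[str, float]


def coordinate_line_search(
    dev_score: Callable[[Params], float],
    grids: Dict[str, Sequence[float]],
    init: Params,
    sweeps: int = 2,
) -> Tuple[Params, float]:
    """Tune one hyperparameter at a time against a dev-set score (higher is better)."""
    best = dict(init)
    best_score = dev_score(best)
    for _ in range(sweeps):
        for name, candidates in grids.items():
            for value in candidates:
                trial = dict(best)
                trial[name] = value  # change only one coordinate
                score = dev_score(trial)
                if score > best_score:
                    best, best_score = trial, score
    return best, best_score
```

In practice `dev_score` would run the coreference system with the given settings and return a development-set metric; here it is left as a callback so the search logic stays independent of any particular system.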