Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning Greedy Policies for the Easy-First Framework
Authors: Jun Xie, Chao Ma, Janardhan Rao Doppa, Prashanth Mannem, Xiaoli Fern, Thomas G. Dietterich, Prasad Tadepalli
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach in two NLP domains: within-document entity coreference resolution and cross-document joint entity and event coreference resolution. Our results demonstrate that the proposed approach achieves statistically significant performance improvement over the baseline training approaches for the Easy-first framework and is less prone to overfitting. |
| Researcher Affiliation | Academia | School of Electrical Engineering and Computer Science, Oregon State University EMAIL School of Electrical Engineering and Computer Science, Washington State University EMAIL |
| Pseudocode | Yes | Algorithm 1 Easy-first inference algorithm with learning option. ... Algorithm 2 The MM algorithm to solve Equation 4 |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | ACE04: (NIST 2004) We employ the same training/testing partition as ACE2004-CULOTTA-TEST (Culotta et al. 2007; Bengtson and Roth 2008). ... OntoNotes-5.0: (Pradhan et al. 2012) We employ the official split for training, validation, and testing. ... We employ the benchmark EECB corpus (Lee et al. 2012) for our experiments. |
| Dataset Splits | Yes | OntoNotes-5.0: (Pradhan et al. 2012) We employ the official split for training, validation, and testing. There are 2802 documents for training; 343 documents for validation; and 345 documents for testing. ... ACE04: (NIST 2004) ... 268 documents are used for training, 68 documents for validating, and 107 documents for testing. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions external tools like 'Stanford multi-pass sieve system' and 'CoNLL scorer (7.0)' but does not provide specific version numbers for its own implementation's software dependencies (e.g., programming languages or libraries with versions). |
| Experiment Setup | Yes | For BGBB, we tune the learning rate ($\eta \in \{10^{-1}, \ldots, 10^{-5}\}$) and the maximum number of repeated perceptron updates ($k \in \{1, 5, 10, 20, 50\}$) for each mistake step. For RBGVB and RBGBB, we tune the regularization parameter ($\lambda \in \{10^{-4}, 10^{-3}, \ldots, 10^{3}\}$). For MM-based method including BGVB, RBGVB, RBGBB, we tune the maximum number of MM iterations ($T \in \{1, 5, 10, 20, 50\}$) and the maximum number of gradient descent steps ($t \in \{1, 5, 10, 20, 50\}$). |
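
The tuning ranges quoted in the Experiment Setup row can be written out as explicit search grids. The following is a minimal sketch, not the authors' code: the dictionary layout, the parameter names (`eta`, `k`, `lam`, `T`, `t`), and the `configurations` helper are hypothetical. It also assumes that each ellipsis steps through consecutive powers of ten and that the listed parameters are searched jointly as a full grid, neither of which the quoted text confirms.

```python
from itertools import product

# Values quoted in the Experiment Setup row. Intermediate powers of ten
# behind each "..." are an assumption.
MM_ITERS = [1, 5, 10, 20, 50]                 # T: max MM iterations
GD_STEPS = [1, 5, 10, 20, 50]                 # t: max gradient descent steps
LAMBDAS = [10.0 ** e for e in range(-4, 4)]   # lambda: 1e-4 up to 1e3

# Hypothetical layout: one grid per training method named in the row.
GRIDS = {
    "BGBB": {
        "eta": [10.0 ** -e for e in range(1, 6)],  # learning rate: 1e-1 down to 1e-5
        "k": [1, 5, 10, 20, 50],                   # max repeated perceptron updates
    },
    "BGVB": {"T": MM_ITERS, "t": GD_STEPS},
    "RBGVB": {"lam": LAMBDAS, "T": MM_ITERS, "t": GD_STEPS},
    "RBGBB": {"lam": LAMBDAS, "T": MM_ITERS, "t": GD_STEPS},
}


def configurations(method):
    """Return every parameter combination for a method (full grid, an assumption)."""
    grid = GRIDS[method]
    names = list(grid)
    return [dict(zip(names, values))
            for values in product(*(grid[name] for name in names))]


if __name__ == "__main__":
    for method in GRIDS:
        print(method, len(configurations(method)))
```

Under these assumptions the regularized methods have the largest search space, since the regularization parameter multiplies the grid shared by all MM-based methods. Each configuration would then be scored on the validation split reported in the Dataset Splits row.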