Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Simultaneously Linking Entities and Extracting Relations from Biomedical Text without Mention-Level Supervision

Authors: Trapit Bansal, Pat Verga, Neha Choudhary, Andrew McCallum

AAAI 2020, pp. 7407–7414 | Venue PDF | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We show that our model outperforms a state-of-the-art entity linking and relation extraction pipeline on two biomedical datasets and can drastically improve the overall recall of the system.
Researcher Affiliation Collaboration University of Massachusetts Amherst; Google Research
Pseudocode No The paper describes the model architecture and mathematical formulations, but it does not include any structured pseudocode or algorithm blocks.
Open Source Code Yes Code and data: https://github.com/theTB/snerl
Open Datasets Yes Our first set of experiments are on the CTD dataset first introduced in Verga, Strubell, and McCallum (2018). The data is derived from annotations in the Chemical Toxicology Database (Davis et al. 2018)...
Dataset Splits Yes We remedy this and create a more challenging train/development/test split from the entire CTD annotations... We consider k as a hyperparameter and tune it on the validation set.
Hardware Specification No The paper does not provide specific details about the hardware used for conducting the experiments.
Software Dependencies No The paper mentions using a Transformer architecture, BioSentVec, and DistMult, but it does not specify version numbers for these or other software dependencies necessary for replication.
Experiment Setup Yes We train based on the cross-entropy loss... We consider k as a hyperparameter and tune it on the validation set. In summary, we combine graph prediction and document-level entity prediction objectives similar to multitask learning (Caruana 1993).