Kernelized Hashcode Representations for Relation Extraction
Authors: Sahil Garg, Aram Galstyan, Greg Ver Steeg, Irina Rish, Guillermo Cecchi, Shuyang Gao
AAAI 2019 (pp. 6431-6440) | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed approach on biomedical relation extraction datasets, and observe significant and robust improvements in accuracy w.r.t. state-of-the-art classifiers, along with drastic (orders-of-magnitude) speedup compared to conventional kernel methods. |
| Researcher Affiliation | Collaboration | 1USC Information Sciences Institute, Marina del Rey, CA USA 2IBM Thomas J. Watson Research Center, Yorktown Heights, NY USA |
| Pseudocode | Yes | Algorithm 1 Optimizing Reference Set for KLSH |
| Open Source Code | Yes | See our code here: github.com/sgarg87/HFR. |
| Open Datasets | Yes | We evaluate our model KLSH-RF (kernelized locality-sensitive hashing with random forest) for the biomedical relation extraction task using four public datasets, AIMed, BioInfer, PubMed45, BioNLP, as briefed below... PubMed45 dataset is available here: github.com/sgarg87/big_mech_isi_gg/tree/master/pubmed45_dataset; the other three datasets are here: corpora.informatik.hu-berlin.de |
| Dataset Splits | Yes | For tuning any other parameters in our model or competitive models, including the choice of a kernel similarity function (PK or GK), we use 10% of training data, sampled randomly, for validation purposes. |
| Hardware Specification | Yes | We employ 4 cores on an i7 processor, with 16GB memory. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies (e.g., specific libraries or frameworks). |
| Experiment Setup | Yes | From a preliminary tuning, we set parameters, H = 1000, R = 250, η = 30, α = 2, and choose RMM as the KLSH technique from the three choices discussed in Sec. 2.1; same parameter values are used across all the experiments unless mentioned otherwise. When optimizing SR with Alg. 1, we use β=1000, γ=300 (sampling parameters are easy to tune). |
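
To make the "Dataset Splits" and "Experiment Setup" rows above concrete, here is a minimal, hypothetical sketch of a KLSH-RF style pipeline. It is not the authors' implementation (see github.com/sgarg87/HFR for that): an RBF kernel on synthetic vectors stands in for the paper's path/graph kernels (PK/GK), the hash construction is a generic random-subset KLSH rather than the paper's RMM variant, and the reference set is sampled uniformly at random instead of being optimized with Algorithm 1. The parameter names `H`, `R`, and `eta` mirror the reported settings; their exact roles here are an assumption.

```python
# Hypothetical sketch of a KLSH-RF style pipeline; NOT the authors' code.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
H, R, ETA = 1000, 250, 30  # hash bits, reference-set size, subset size (per the paper's reported values)

def klsh_hashcodes(X, X_ref, num_bits=H, eta=ETA, rng=rng):
    """Map each example to a binary hashcode built from kernel similarities
    to a reference set (a generic KLSH construction, assumed here)."""
    K = rbf_kernel(X, X_ref)  # (n, R) kernel similarities; RBF stands in for PK/GK
    bits = np.empty((X.shape[0], num_bits), dtype=np.uint8)
    for h in range(num_bits):
        a = rng.choice(X_ref.shape[0], size=eta, replace=False)
        b = rng.choice(X_ref.shape[0], size=eta, replace=False)
        # Bit h: which random reference subset is the example more similar to?
        bits[:, h] = K[:, a].sum(axis=1) > K[:, b].sum(axis=1)
    return bits

# Synthetic stand-in data for illustration only.
X = rng.normal(size=(2000, 64))
y = rng.integers(0, 2, size=2000)

# 10% of training data held out at random for validation, as in the paper.
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.1, random_state=0)

# Random reference set; the paper instead optimizes this set with Algorithm 1.
X_ref = X_tr[rng.choice(X_tr.shape[0], size=R, replace=False)]

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(klsh_hashcodes(X_tr, X_ref), y_tr)
print("validation accuracy:", rf.score(klsh_hashcodes(X_va, X_ref), y_va))
```

The key design point the sketch illustrates is why the approach is fast: the expensive kernel is evaluated only against the small reference set (R = 250 examples) rather than between all pairs of training examples, and the resulting binary hashcodes are cheap inputs for a random forest.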