Kernelized Hashcode Representations for Relation Extraction
Authors: Sahil Garg, Aram Galstyan, Greg Ver Steeg, Irina Rish, Guillermo Cecchi, Shuyang Gao
AAAI 2019 (pp. 6431-6440) | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate the proposed approach on biomedical relation extraction datasets, and observe significant and robust improvements in accuracy w.r.t. state-of-the-art classifiers, along with drastic (orders-of-magnitude) speedup compared to conventional kernel methods. |
| Researcher Affiliation | Collaboration | 1USC Information Sciences Institute, Marina del Rey, CA USA 2IBM Thomas J. Watson Research Center, Yorktown Heights, NY USA |
| Pseudocode | Yes | Algorithm 1 Optimizing Reference Set for KLSH |
| Open Source Code | Yes | See our code here: github.com/sgarg87/HFR. |
| Open Datasets | Yes | We evaluate our model KLSH-RF (kernelized locality-sensitive hashing with random forest) for the biomedical relation extraction task using four public datasets, AIMed, BioInfer, PubMed45, BioNLP, as briefed below... PubMed45 dataset is available here: github.com/sgarg87/big_mech_isi_gg/tree/master/pubmed45_dataset; the other three datasets are here: corpora.informatik.hu-berlin.de |
| Dataset Splits | Yes | For tuning any other parameters in our model or competitive models, including the choice of a kernel similarity function (PK or GK), we use 10% of training data, sampled randomly, for validation purposes. |
| Hardware Specification | Yes | We employ 4 cores on an i7 processor, with 16GB memory. |
| Software Dependencies | No | The paper does not specify version numbers for any software dependencies (e.g., specific libraries or frameworks). |
| Experiment Setup | Yes | From a preliminary tuning, we set parameters, H = 1000, R = 250, η = 30, α = 2, and choose RMM as the KLSH technique from the three choices discussed in Sec. 2.1; same parameter values are used across all the experiments unless mentioned otherwise. When optimizing SR with Alg. 1, we use β=1000, γ=300 (sampling parameters are easy to tune). |
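
To make the "Dataset Splits" and "Experiment Setup" rows above concrete, here is a minimal, hypothetical sketch of a KLSH-RF style pipeline. It is not the authors' implementation (see github.com/sgarg87/HFR for that): an RBF kernel on synthetic vectors stands in for the paper's path/graph kernels (PK/GK), the hash construction is a generic random-subset KLSH rather than the paper's RMM variant, and the reference set is sampled uniformly at random instead of being optimized with Algorithm 1. The parameter names `H`, `R`, and `eta` mirror the reported settings; their exact roles here are an assumption.

```python
# Hypothetical sketch of a KLSH-RF style pipeline; NOT the authors' code.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
H, R, ETA = 1000, 250, 30  # hash bits, reference-set size, subset size (per the paper's reported values)

def klsh_hashcodes(X, X_ref, num_bits=H, eta=ETA, rng=rng):
    """Map each example to a binary hashcode built from kernel similarities
    to a reference set (a generic KLSH construction, assumed here)."""
    K = rbf_kernel(X, X_ref)  # (n, R) kernel similarities; RBF stands in for PK/GK
    bits = np.empty((X.shape[0], num_bits), dtype=np.uint8)
    for h in range(num_bits):
        a = rng.choice(X_ref.shape[0], size=eta, replace=False)
        b = rng.choice(X_ref.shape[0], size=eta, replace=False)
        # Bit h: which random reference subset is the example more similar to?
        bits[:, h] = K[:, a].sum(axis=1) > K[:, b].sum(axis=1)
    return bits

# Synthetic stand-in data for illustration only.
X = rng.normal(size=(2000, 64))
y = rng.integers(0, 2, size=2000)

# 10% of training data held out at random for validation, as in the paper.
X_tr, X_va, y_tr, y_va = train_test_split(X, y, test_size=0.1, random_state=0)

# Random reference set; the paper instead optimizes this set with Algorithm 1.
X_ref = X_tr[rng.choice(X_tr.shape[0], size=R, replace=False)]

rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(klsh_hashcodes(X_tr, X_ref), y_tr)
print("validation accuracy:", rf.score(klsh_hashcodes(X_va, X_ref), y_va))
```

The key design point the sketch illustrates is why the approach is fast: the expensive kernel is evaluated only against the small reference set (R = 250 examples) rather than between all pairs of training examples, and the resulting binary hashcodes are cheap inputs for a random forest.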