Seed-Guided Fine-Grained Entity Typing in Science and Engineering Domains

Authors: Yu Zhang, Yunyi Zhang, Yanzhen Shen, Yu Deng, Lucian Popa, Larisa Shwartz, ChengXiang Zhai, Jiawei Han

AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on two datasets covering four domains demonstrate the effectiveness of SEType in comparison with various baselines.
Researcher Affiliation | Collaboration | (1) Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA; (2) IBM Thomas J. Watson Research Center, Yorktown Heights, NY, USA; (3) IBM Almaden Research Center, San Jose, CA, USA
Pseudocode | No | The paper describes the proposed framework and its phases in text and with a diagram (Figure 1), but it does not contain any formal pseudocode or algorithm blocks.
Open Source Code | Yes | Code and data are available at: https://github.com/yuzhimanhua/SEType.
Open Datasets | Yes | We use two publicly available datasets from the software engineering and security domains: Stack Overflow NER (Tabassum et al. 2020) and Cybersecurity (Bridges et al. 2013). (1) https://github.com/jeniyat/StackOverflowNER (2) https://github.com/stucco/auto-labeled-corpus
Dataset Splits | Yes | In the original dataset, Stack Overflow question-answer threads are split into training, validation, and testing sets, while GitHub issue reports form a testing set only. ... We take the training and validation corpora of Stack Overflow, remove their annotations, and treat them as unlabeled corpora to create pseudo-labeled training and validation sets, respectively. ... For the larger NVD corpus, we take 20% as the annotated testing data, and the remaining 80% are treated as unlabeled text to create pseudo-labeled training and validation data. (A split sketch based on this description is given after the table.)
Hardware Specification | Yes | The model is trained on one NVIDIA RTX A6000 GPU.
Software Dependencies | No | The paper mentions using BERTOverflow as the PLM and the AdamW optimizer, but does not provide specific version numbers for these or for other software dependencies such as Python or PyTorch.
Experiment Setup | Yes | During model training, the window size of context sentences is c = 1; the maximum premise length is 462 tokens; the maximum hypothesis length is 50 tokens; the training batch size is 4; we use the AdamW optimizer (Loshchilov and Hutter 2019), warming up the learning rate for the first 100 steps and then linearly decaying it, with a learning rate of 5e-5, a weight decay of 0.01, and ϵ = 1e-8. (A configuration sketch using these values is given after the table.)
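
The NVD/Cybersecurity partition described under Dataset Splits is easy to reproduce in outline. Below is a minimal Python sketch of the 20%/80% split: only the 20% annotated test / 80% unlabeled division is stated above, so the random seed, the document field names, and the validation fraction carved out of the unlabeled portion are illustrative assumptions.

```python
# Minimal sketch of the NVD/Cybersecurity corpus split: 20% of the annotated documents
# are held out as the test set; the remaining 80% have their labels stripped so they
# can later be pseudo-labeled for training and validation.
# The seed, field names, and validation fraction are assumptions, not from the paper.
import random

def split_nvd_corpus(documents, seed=42, test_ratio=0.2, val_ratio=0.1):
    """documents: list of dicts with a 'text' field and gold 'entities' annotations."""
    rng = random.Random(seed)
    docs = list(documents)
    rng.shuffle(docs)

    n_test = int(len(docs) * test_ratio)
    test_set = docs[:n_test]                                   # keeps gold annotations
    unlabeled = [{"text": d["text"]} for d in docs[n_test:]]   # annotations removed

    # A slice of the unlabeled portion is reserved for the pseudo-labeled validation set.
    n_val = int(len(unlabeled) * val_ratio)
    return {
        "test": test_set,
        "unlabeled_train": unlabeled[n_val:],
        "unlabeled_val": unlabeled[:n_val],
    }
```

The Stack Overflow side needs no such split: the original training/validation/test partition is reused, with annotations removed from the training and validation corpora before pseudo-labeling.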
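
The Experiment Setup row gives concrete optimizer and scheduler values, which the sketch below wires together with PyTorch and Hugging Face transformers. This is not the authors' code: the BERTOverflow checkpoint identifier and the total number of training steps are assumptions, while the learning rate, weight decay, ϵ, warmup, and batch size follow the reported values. Note that the 462-token premise cap and the 50-token hypothesis cap sum to 512, BERT's maximum sequence length.

```python
# Hedged sketch of the reported training configuration (not the authors' implementation).
# Checkpoint id and total step count are assumptions; lr, weight decay, eps, warmup,
# and batch size follow the values quoted in the Experiment Setup row above.
import torch
from transformers import AutoModel, AutoTokenizer, get_linear_schedule_with_warmup

MODEL_NAME = "jeniyat/BERTOverflow"   # assumed Hugging Face id for the BERTOverflow PLM
MAX_PREMISE_LEN = 462                 # maximum premise length (tokens)
MAX_HYPOTHESIS_LEN = 50               # maximum hypothesis length (tokens); 462 + 50 = 512
BATCH_SIZE = 4

device = "cuda" if torch.cuda.is_available() else "cpu"  # paper: one NVIDIA RTX A6000 GPU
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModel.from_pretrained(MODEL_NAME).to(device)

# Pair encoding capped at 512 tokens; how the per-segment caps are enforced is not specified.
enc = tokenizer(
    "premise text ...", "hypothesis text ...",
    truncation=True, max_length=MAX_PREMISE_LEN + MAX_HYPOTHESIS_LEN, return_tensors="pt",
)

# AdamW with lr 5e-5, weight decay 0.01, and eps 1e-8, as reported.
optimizer = torch.optim.AdamW(model.parameters(), lr=5e-5, weight_decay=0.01, eps=1e-8)

# Warm up the learning rate for the first 100 steps, then decay it linearly.
num_training_steps = 10_000  # assumption: not stated in the quoted setup
scheduler = get_linear_schedule_with_warmup(
    optimizer, num_warmup_steps=100, num_training_steps=num_training_steps
)
```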