Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Reliable Data Generation and Selection for Low-Resource Relation Extraction
Authors: Junjie Yu, Xing Wang, Wenliang Chen
AAAI 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through experimentation on three datasets with low-resource settings, we demonstrate the effectiveness of our proposed approach in constructing annotated data and achieving noteworthy improvements in comparison to multiple baselines. |
| Researcher Affiliation | Collaboration | Junjie Yu¹, Xing Wang², Wenliang Chen¹*; ¹School of Computer Science and Technology, Soochow University, Suzhou, China; ²Tencent AI Lab, Shenzhen, China; EMAIL, EMAIL, EMAIL |
| Pseudocode | Yes | Algorithm 1: Sentence Selection and Training Input: seed train Dseed, triplets T and Generator Mg. Hyper Parameter: number of sentences generated for each triplet K. Output: Selected Data Dsel and Relation Extractor Mre. |
| Open Source Code | Yes | Code, data and models are available at https://github.com/jjyunlp/Generation_RE. |
| Open Datasets | Yes | To verify our Self-RDGS approach, we conduct experiments on three datasets, including two human-annotated datasets and one DS-annotated dataset. SemEval: A human-annotated dataset from SemEval-2010 Task 8 (Hendrickx et al. 2010)... Re-TACRED: A revised version of the human-annotated dataset TACRED (Zhang et al. 2017) proposed by (Stoica, Platanios, and Póczos 2021). NYT10m: An updated version of the widely used DS dataset NYT10 (Riedel, Yao, and McCallum 2010)... |
| Dataset Splits | Yes | To enhance the realistic of low-resource scenarios, we do not create a separate validation set in our approach. Instead, the seed data serves as the validation set while the automatically generated sentences serves as training data. |
| Hardware Specification | No | The paper does not provide specific hardware details (like GPU models, CPU models, or memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions models like GPT-2 large, LLaMa2-7B-chat, ChatGLM2-6B, and BERT-base, but does not provide specific version numbers for these or other software dependencies. |
| Experiment Setup | Yes | Relation Extraction Training: We utilize BERT-base (Devlin et al. 2018) to build the RE models. Throughout the training process, we set the learning rate to 5e-5 and maintain a batch size of 32, according to the performance on the validation set. The model is trained for a maximum of 20 epochs, and early stopping is determined by the validation performance. |
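
The Experiment Setup row can be read as a small training configuration. The sketch below is a minimal Python illustration of it, not the authors' code: the learning rate, batch size, and epoch cap are the quoted values, while the `patience` value, the class name `RETrainingConfig`, and the `run_epoch` callback are assumptions introduced here for illustration. The quoted text does not say how many stalled epochs trigger early stopping.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class RETrainingConfig:
    """Hypothetical container for the quoted hyperparameters."""
    learning_rate: float = 5e-5  # quoted in the Experiment Setup row
    batch_size: int = 32         # quoted in the Experiment Setup row
    max_epochs: int = 20         # quoted in the Experiment Setup row
    patience: int = 3            # ASSUMPTION: not stated in the quoted text


def train_with_early_stopping(
    run_epoch: Callable[[int], float],
    config: RETrainingConfig = RETrainingConfig(),
) -> Tuple[int, float]:
    """Run up to max_epochs and stop once the validation score stalls.

    run_epoch(epoch) is a hypothetical callback that trains for one epoch
    and returns a validation score where higher is better.
    Returns (best_epoch, best_score).
    """
    best_epoch, best_score = 0, float("-inf")
    epochs_without_gain = 0
    for epoch in range(1, config.max_epochs + 1):
        score = run_epoch(epoch)
        if score > best_score:
            # New best validation score: remember it and reset the counter.
            best_epoch, best_score = epoch, score
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= config.patience:
                break  # validation score has stalled for `patience` epochs
    return best_epoch, best_score
```

In the setting described in the Dataset Splits row, `run_epoch` would train on the generated sentences and score on the seed data, which serves as the validation set.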