Enhancing Low-Resource Relation Representations through Multi-View Decoupling
Authors: Chenghao Fan, Wei Wei, Xiaoye Qu, Zhenyi Lu, Wenfeng Xie, Yu Cheng, Dangyang Chen
AAAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three benchmark datasets show that our method can achieve state-of-the-art in low-resource settings. |
| Researcher Affiliation | Collaboration | Chenghao Fan1,2, Wei Wei*1,2, Xiaoye Qu1,2, Zhenyi Lu1,2, Wenfeng Xie3, Yu Cheng4, Dangyang Chen3 1Cognitive Computing and Intelligent Information Processing (CCIIP) Laboratory, School of Computer Science and Technology, Huazhong University of Science and Technology 2Joint Laboratory of HUST and Pingan Property & Casualty Research (HPL) 3Ping An Property & Casualty Insurance Company of China, Ltd. 4The Chinese University of Hong Kong |
| Pseudocode | No | The paper describes its method using textual explanations and mathematical equations. However, it does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not include any explicit statement about releasing source code for the methodology described, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Datasets For comprehensive experiments, we conduct experiments on three RE datasets: SemEval-2010 Task 8 (SemEval) (Hendrickx et al. 2010), TACRED (Zhang et al. 2017), and TACRED-Revisit (TACREV) (Alt, Gabryszak, and Hennig 2020). Here we briefly describe them below. The detailed statistics are provided in Table 1. |
| Dataset Splits | Yes | Table 1: The statistics of different RE datasets. SemEval: 6,507 train, 1,493 dev, 2,717 test, 19 relations; TACRED: 68,124 train, 22,631 dev, 15,509 test, 42 relations; TACREV: 68,124 train, 22,631 dev, 15,509 test, 42 relations. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments, such as GPU models, CPU types, or memory. |
| Software Dependencies | No | The paper states 'We utilize RoBERTa-large for all experiments to make a fair comparison.' While it mentions a pre-trained language model, it does not specify version numbers for any software dependencies, libraries, or frameworks used (e.g., Python version, PyTorch/TensorFlow versions). |
| Experiment Setup | Yes | Implementation Details We utilize RoBERTa-large for all experiments to make a fair comparison. For test metrics, we use the micro F1 score of RE as the primary metric to evaluate models, considering that F1 scores assess the overall balance of precision and recall. Low-Resource Setting. We adopt the same setting as RetrievalRE (Chen et al. 2022a) and perform experiments using 1-, 5-, and 16-shot scenarios to evaluate the performance of our approach in extremely low-resource situations. To avoid randomness, we employ a fixed set of seeds to randomly sample data five times and record the average performance and variance. During the sampling process, we select k instances for each relation label from the original training sets to compose the few-shot training sets. Standard Setting. In the standard setting, we leverage the full training sets to conduct experiments and compare with previous prompt-tuning methods, including PTR, KnowPrompt, and RetrievalRE. (A sketch of the sampling protocol appears below the table.) |
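
The Experiment Setup row describes the low-resource protocol (k instances per relation, a fixed set of seeds, five samplings, micro F1 averaged with variance), but since no code is released, the snippet below is only a minimal sketch of that protocol, not the authors' implementation. The names `sample_k_shot`, `run_low_resource_eval`, and `train_and_predict`, as well as the seed values, are illustrative assumptions; micro F1 is computed here with scikit-learn.

```python
import random
from collections import defaultdict

from sklearn.metrics import f1_score


def sample_k_shot(train_examples, k, seed):
    """Select k instances per relation label from the original training set."""
    rng = random.Random(seed)
    by_relation = defaultdict(list)
    for example in train_examples:  # each example: {"text": ..., "relation": ...}
        by_relation[example["relation"]].append(example)
    few_shot = []
    for examples in by_relation.values():
        # Guard against labels with fewer than k training instances.
        few_shot.extend(rng.sample(examples, min(k, len(examples))))
    return few_shot


# Placeholder seeds: the paper states a fixed set of seeds is used to sample
# the data five times, but does not list the values.
SEEDS = [1, 2, 3, 4, 5]


def run_low_resource_eval(train_examples, k, train_and_predict):
    """Average micro F1 (and its variance) over five few-shot samplings."""
    scores = []
    for seed in SEEDS:
        few_shot_train = sample_k_shot(train_examples, k, seed)
        # train_and_predict is a hypothetical hook that fine-tunes the model on
        # the few-shot set and returns (gold_labels, predictions) on the test set.
        y_true, y_pred = train_and_predict(few_shot_train, seed)
        scores.append(f1_score(y_true, y_pred, average="micro"))
    mean = sum(scores) / len(scores)
    variance = sum((s - mean) ** 2 for s in scores) / len(scores)
    return mean, variance
```

In this reading of the setup, k would be 1, 5, or 16 for the low-resource scenarios, while the standard setting simply trains on the full training set without the sampling step.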