None Class Ranking Loss for Document-Level Relation Extraction

Authors: Yang Zhou, Wee Sun Lee

Venue: IJCAI 2022

Each entry below gives a reproducibility variable, the assessed result, and the supporting excerpt from the paper (the LLM response).
Research Type: Experimental. Evidence: "Experimental results demonstrate that our method significantly outperforms existing multi-label losses for document-level RE and works well in other multi-label tasks such as emotion classification when none class instances are available for training." ... "4 Experiments: In this section, we evaluate NCRL on two document-level RE datasets."
Researcher Affiliation: Academia. Evidence: "Yang Zhou, Wee Sun Lee, School of Computing, National University of Singapore, {zhouy, leews}@comp.nus.edu.sg"
Pseudocode: No. The paper describes the loss functions and mathematical formulations but does not include any structured pseudocode or algorithm blocks.
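Since the paper provides no pseudocode, the following is a minimal PyTorch sketch of the general idea it describes: the none-class score acts as a dynamic threshold that positive relation labels should rank above and negative labels below, with gamma as a margin shift. The pairwise softplus ranking terms, the function name, and the tensor layout here are illustrative assumptions, not the paper's Eq. (6); the official repository linked below should be treated as authoritative.

```python
import torch
import torch.nn.functional as F

def none_class_ranking_loss(logits, labels, gamma=0.0):
    """Illustrative threshold-style ranking loss; NOT the paper's exact Eq. (6).

    logits: (batch, num_classes) raw scores; column 0 is assumed to be the none class.
    labels: (batch, num_classes) multi-hot targets over the same classes.
    gamma:  margin shift applied to the none-class score (stand-in for the
            paper's margin-shifting hyper-parameter).
    """
    none_score = logits[:, :1] + gamma   # shifted none-class score as a dynamic threshold
    rel_logits = logits[:, 1:]
    pos_mask = labels[:, 1:].bool()

    # Positive relation classes should rank above the (shifted) none class.
    pos_loss = F.softplus(none_score - rel_logits)[pos_mask].sum()
    # The none class should rank above every negative relation class.
    neg_loss = F.softplus(rel_logits - none_score)[~pos_mask].sum()
    return (pos_loss + neg_loss) / logits.size(0)
```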
Open Source Code: Yes. Evidence: "Our code is available at https://github.com/yangzhou12/NCRL."
Open Datasets: Yes. Evidence: "Datasets. DocRED [Yao et al., 2019] is a large-scale document-level RE dataset, which is constructed from Wikipedia articles." ... "DialogRE [Yu et al., 2020] is a dialogue-based RE dataset..." "GoEmotions [Demszky et al., 2020] is an emotion classification dataset..."
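All three datasets are public. A minimal sketch of loading them through the Hugging Face datasets library follows; the hub IDs are assumptions drawn from the commonly used dataset cards, not from the paper itself:

```python
from datasets import load_dataset

# Hub IDs are assumptions; check the dataset cards before relying on them.
docred = load_dataset("docred")            # document-level RE [Yao et al., 2019]
dialog_re = load_dataset("dialog_re")      # dialogue-based RE [Yu et al., 2020]
go_emotions = load_dataset("go_emotions")  # emotion classification [Demszky et al., 2020]

print(docred)  # inspect the available splits before training
```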
Dataset Splits: Yes. Evidence: "DocRED ... where 3053 documents are used for training, 1000 for development, and 1000 for testing." "DialogRE ... where 60% of dialogues are used for training, 20% for development, and 20% for testing." "GoEmotions ... where 80% of the data are used for training, 10% for development, and 10% for testing."
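For corpora distributed without a fixed partition, a percentage split like those quoted above can be reproduced along these lines (a sketch assuming a seeded random split; the paper does not state how its splits were drawn):

```python
from sklearn.model_selection import train_test_split

examples = list(range(1000))  # stand-in for the real list of documents/dialogues

# 80/10/10: hold out 20%, then halve the holdout into dev and test.
train, rest = train_test_split(examples, test_size=0.2, random_state=42)
dev, test = train_test_split(rest, test_size=0.5, random_state=42)
```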
Hardware Specification: Yes. Evidence: "All the experiments are conducted with 1 GeForce RTX 3090 GPU."
Software Dependencies: No. The paper mentions "Huggingface's Transformers" but does not specify a version number for this or any other software dependency.
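Because no versions are pinned, anyone reproducing the results should record their own environment; a minimal way to do so:

```python
import sys
import torch
import transformers

# Log the exact versions in use, since the paper does not pin any.
print("python      :", sys.version.split()[0])
print("torch       :", torch.__version__)
print("transformers:", transformers.__version__)
print("cuda        :", torch.version.cuda)
```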
Experiment Setup: Yes. Evidence: "We use AdamW [Loshchilov and Hutter, 2019] as the optimizer with learning rates {1e-5, 2e-5, ..., 5e-5}, and apply a linear warmup [Goyal et al., 2017] at the first 10% of steps followed by a linear decay to 0. The number of training epochs is selected from {5, 8, 10, 20, 30}." ... "For NCRL, the hyper-parameter γ in margin shifting (6) is selected from {0, 0.01, 0.05}."
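The quoted schedule (linear warmup over the first 10% of steps, then linear decay to 0) matches the standard Transformers linear schedule; a minimal sketch follows, where the model, the total step count, and the chosen learning rate from the quoted grid are placeholder assumptions:

```python
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(768, 97)  # stand-in for the actual RE model
total_steps = 10_000              # assumption: epochs * batches per epoch
lr = 3e-5                         # one value from the quoted grid {1e-5, ..., 5e-5}

optimizer = torch.optim.AdamW(model.parameters(), lr=lr)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),  # warmup over the first 10% of steps
    num_training_steps=total_steps,           # then linear decay to 0
)

for step in range(total_steps):
    # ... forward pass and loss.backward() on a batch ...
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```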