None Class Ranking Loss for Document-Level Relation Extraction
Authors: Yang Zhou, Wee Sun Lee
IJCAI 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate that our method significantly outperforms existing multi-label losses for document-level RE and works well in other multi-label tasks such as emotion classification when none class instances are available for training. ... (Section 4, Experiments) In this section, we evaluate NCRL on two document-level RE datasets. |
| Researcher Affiliation | Academia | Yang Zhou, Wee Sun Lee, School of Computing, National University of Singapore, {zhouy, leews}@comp.nus.edu.sg |
| Pseudocode | No | The paper describes the loss functions and mathematical formulations but does not include any structured pseudocode or algorithm blocks. (A generic, hedged sketch of a none-class ranking loss, not taken from the paper, is given after this table.) |
| Open Source Code | Yes | Our code is available at https://github.com/yangzhou12/NCRL. |
| Open Datasets | Yes | Datasets. DocRED [Yao et al., 2019] is a large-scale document-level RE dataset, which is constructed from Wikipedia articles. ... DialogRE [Yu et al., 2020] is a dialogue-based RE dataset... GoEmotions [Demszky et al., 2020] is an emotion classification dataset... |
| Dataset Splits | Yes | DocRED ... where 3053 documents are used for training, 1000 for development, and 1000 for testing. DialogRE ... where 60% of dialogues are used for training, 20% for development, and 20% for testing. GoEmotions ... where 80% of the data are used for training, 10% for development, and 10% for testing. |
| Hardware Specification | Yes | All the experiments are conducted with 1 GeForce RTX 3090 GPU. |
| Software Dependencies | No | The paper mentions 'Huggingface's Transformers' but does not specify a version number for this or any other software dependency. |
| Experiment Setup | Yes | We use AdamW [Loshchilov and Hutter, 2019] as the optimizer with learning rates {1e-5, 2e-5, ..., 5e-5}, and apply a linear warmup [Goyal et al., 2017] during the first 10% of steps followed by a linear decay to 0. The number of training epochs is selected from {5, 8, 10, 20, 30}. ... For NCRL, the hyper-parameter γ in margin shifting (6) is selected from {0, 0.01, 0.05}. (A sketch of this optimizer and schedule setup follows after the table.) |
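
Since the paper provides no pseudocode, the following is only a generic, hedged sketch of the idea named in the title: ranking each predefined class against a dedicated none class, with a margin shift γ (the hyper-parameter reported in the Experiment Setup row). It is not the paper's NCRL formulation; the exact loss, including its margin-shifting equation (6), is defined in the paper and in the released code at https://github.com/yangzhou12/NCRL. The index-0 placement of the none class and the softplus surrogate below are assumptions made purely for illustration.

```python
# Generic illustration only: rank each predefined class against a "none"
# class with a margin shift gamma. This is NOT the paper's exact NCRL loss;
# the none-class index-0 convention and the softplus surrogate are assumptions.
import torch
import torch.nn.functional as F

def none_class_ranking_sketch(logits: torch.Tensor,
                              labels: torch.Tensor,
                              gamma: float = 0.05) -> torch.Tensor:
    """logits: (batch, 1 + num_classes), none class at index 0 (assumed).
    labels: multi-hot (batch, num_classes) over the predefined classes."""
    none_logit = logits[:, :1]        # score of the none class
    class_logits = logits[:, 1:]      # scores of the predefined classes

    # Positive classes should outrank the margin-shifted none class;
    # negative classes should be outranked by the none class.
    pos_term = F.softplus(none_logit + gamma - class_logits)  # push s_r above s_none + gamma
    neg_term = F.softplus(class_logits - none_logit)          # push s_r below s_none

    loss = (labels * pos_term + (1.0 - labels) * neg_term).sum(dim=1)
    return loss.mean()

# Example: 4 instances, 5 predefined classes plus the none class.
logits = torch.randn(4, 6)
labels = torch.bernoulli(torch.full((4, 5), 0.3))
print(none_class_ranking_sketch(logits, labels, gamma=0.05))
```

With γ = 0 the margin shift disappears, matching the smallest value in the reported search grid {0, 0.01, 0.05}.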
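
As a companion to the Experiment Setup row, here is a minimal sketch of the reported optimizer and schedule: AdamW with a linear warmup over the first 10% of training steps followed by a linear decay to 0. The scheduler call assumes the Huggingface Transformers helper `get_linear_schedule_with_warmup`, which the paper does not explicitly name; the model, step counts, and the chosen learning rate are placeholders rather than values taken from the paper.

```python
# Minimal sketch of the reported optimizer/schedule: AdamW with a linear
# warmup over the first 10% of steps, followed by a linear decay to 0.
# The model, step counts, and the chosen learning rate are placeholders.
import torch
from transformers import get_linear_schedule_with_warmup

model = torch.nn.Linear(768, 10)   # stand-in for the actual RE classifier
learning_rate = 3e-5               # selected from {1e-5, 2e-5, ..., 5e-5}
num_epochs = 10                    # selected from {5, 8, 10, 20, 30}
steps_per_epoch = 500              # depends on dataset size and batch size

total_steps = num_epochs * steps_per_epoch
optimizer = torch.optim.AdamW(model.parameters(), lr=learning_rate)
scheduler = get_linear_schedule_with_warmup(
    optimizer,
    num_warmup_steps=int(0.1 * total_steps),  # warmup during the first 10% of steps
    num_training_steps=total_steps,           # then decay linearly to 0
)

for step in range(total_steps):
    # ... forward pass and loss.backward() would go here ...
    optimizer.step()
    scheduler.step()
    optimizer.zero_grad()
```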