Effective Modeling of Encoder-Decoder Architecture for Joint Entity and Relation Extraction
Authors: Tapas Nayak, Hwee Tou Ng (pp. 8528-8535)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the publicly available New York Times corpus show that our proposed approaches outperform previous work and achieve significantly higher F1 scores. |
| Researcher Affiliation | Academia | Tapas Nayak, Hwee Tou Ng; Department of Computer Science, National University of Singapore; nayakt@u.nus.edu, nght@comp.nus.edu.sg |
| Pseudocode | No | The paper describes the model architecture and algorithms textually and with diagrams (Figure 1), but does not include explicit pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code and data of this paper can be found at https://github.com/nusnlp/PtrNetDecoding4JERE |
| Open Datasets | Yes | We choose the New York Times (NYT) corpus for our experiments. This corpus has multiple versions, and we choose the following two versions... (i) The first version is used by Zeng et al. (2018) (mentioned as NYT in their paper) and has 24 relations. We name this version as NYT24. (ii) The second version is used by Takanobu et al. (2019) (mentioned as NYT10 in their paper) and has 29 relations. We name this version as NYT29. Experiments on the publicly available New York Times corpus |
| Dataset Splits | Yes | We select 10% of the original training data and use it as the validation dataset. The remaining 90% is used for training. |
| Hardware Specification | No | The paper mentions 'GPU memory' and 'GPU configuration' but does not provide specific hardware details such as GPU models, CPU types, or memory amounts used for experiments. |
| Software Dependencies | No | The paper mentions tools and optimizers like 'Word2Vec' and 'Adam' but does not provide specific version numbers for software dependencies or libraries used for implementation (e.g., Python, PyTorch, TensorFlow versions). |
| Experiment Setup | Yes | We set the word embedding dimension dw = 300, relation embedding dimension dr = 300, character embedding dimension dc = 50, and character-based word feature dimension df = 50. To extract the character-based word feature vector, we set the CNN filter width at 3 and the maximum length of a word at 10. The hidden dimension dh of the decoder LSTM cell is set at 300 and the hidden dimension of the forward and the backward LSTM of the encoder is set at 150. The hidden dimension of the forward and backward LSTM of the pointer networks is set at dp = 300. The model is trained with mini-batch size of 32 and the network parameters are optimized using Adam (Kingma and Ba 2015). Dropout layers with a dropout rate fixed at 0.3 are used in our network to avoid overfitting. |
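
The hyperparameters quoted in the Experiment Setup row above map onto a model configuration roughly as follows. This is a minimal sketch assuming PyTorch (the paper does not name its framework or versions); the module names, vocabulary sizes, and overall wiring here are illustrative and are not taken from the released code.

```python
# Sketch of the reported hyperparameter configuration (assumes PyTorch;
# all class/variable names are illustrative, not from the authors' repository).
import torch
import torch.nn as nn

DW, DR, DC, DF = 300, 300, 50, 50   # word, relation, char, char-based word feature dims
DH, DP = 300, 300                   # decoder LSTM and pointer-network BiLSTM hidden dims
ENC_H = 150                         # per-direction hidden size of the BiLSTM encoder
MAX_WORD_LEN = 10                   # maximum word length for the character CNN

class CharCNN(nn.Module):
    """Character-based word features: char embeddings -> CNN (filter width 3) -> max-pool."""
    def __init__(self, char_vocab_size):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab_size, DC)
        self.conv = nn.Conv1d(DC, DF, kernel_size=3, padding=1)

    def forward(self, char_ids):                      # (batch, seq_len, MAX_WORD_LEN)
        b, s, w = char_ids.size()
        x = self.char_emb(char_ids.view(b * s, w))    # (b*s, w, DC)
        x = self.conv(x.transpose(1, 2))              # (b*s, DF, w)
        return x.max(dim=2).values.view(b, s, DF)     # (batch, seq_len, DF)

class Encoder(nn.Module):
    """BiLSTM encoder over [word embedding; char-based word feature] inputs."""
    def __init__(self, word_vocab_size, char_vocab_size):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab_size, DW)
        self.char_cnn = CharCNN(char_vocab_size)
        self.bilstm = nn.LSTM(DW + DF, ENC_H, batch_first=True, bidirectional=True)
        self.dropout = nn.Dropout(0.3)                # dropout rate fixed at 0.3

    def forward(self, word_ids, char_ids):
        x = torch.cat([self.word_emb(word_ids), self.char_cnn(char_ids)], dim=-1)
        out, _ = self.bilstm(self.dropout(x))         # (batch, seq_len, 2*ENC_H) == DH
        return out

# Training setup per the reported configuration: Adam optimizer, mini-batch size 32.
# encoder = Encoder(word_vocab_size=50_000, char_vocab_size=100)  # vocab sizes are placeholders
# optimizer = torch.optim.Adam(encoder.parameters())
```

Note that the two 150-dimensional directions of the encoder BiLSTM concatenate to a 300-dimensional state, matching the 300-dimensional decoder LSTM reported in the paper.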