Adversarial Transfer for Named Entity Boundary Detection with Pointer Networks
Authors: Jing Li, Deheng Ye, Shuo Shang
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct Formal Text → Formal Text, Formal Text → Informal Text and ablation evaluations on five benchmark datasets. Experimental results show that AT-BDRY achieves state-of-the-art transferring performance against recent baselines. and Section 4 (Experiments) |
| Researcher Affiliation | Industry | ¹Inception Institute of Artificial Intelligence, Abu Dhabi, United Arab Emirates; ²Tencent AI Lab, Shenzhen, China |
| Pseudocode | No | The paper presents architectural diagrams and equations (e.g., Figure 1, Figure 2, equations 1-11) but does not include any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statements about open-sourcing the code for the described methodology, nor does it include a link to a code repository. |
| Open Datasets | Yes | We use five popular benchmark datasets to ascertain the effectiveness of AT-BDRY. Because our task is boundary detection, we ignore entity types in all datasets. The statistics of the datasets are reported in Table 1. CoNLL03, OntoNotes5.0 and WikiGold are formal text. WNUT16 and WNUT17 are informal text. |
| Dataset Splits | Yes | We randomly leave out 20% of training set, and combine it with development set as annotated target-domain data for these three baselines. Table 1 (Statistics of datasets) lists, per dataset, the Train / Dev / Test sentence counts and the number of mentions: CoNLL03 14,987 / 3,466 / 3,684, 34,841 mentions; OntoNotes5.0 59,917 / 8,528 / 8,262, 71,031 mentions; WikiGold 144,342, 1,696, 298,961; WNUT16 2,394 / 1,000 / 3,856, 5,630 mentions; WNUT17 3,394 / 1,009 / 1,287, 3,890 mentions. A sketch of this split appears after the table. |
| Hardware Specification | Yes | All neural network models are implemented with PyTorch framework and evaluated on NVIDIA Tesla V100 GPU. |
| Software Dependencies | No | All neural network models are implemented with PyTorch framework and evaluated on NVIDIA Tesla V100 GPU. |
| Experiment Setup | Yes | For all neural network models, we use GloVe 300-dimensional pre-trained word embeddings released by Stanford, which are fine-tuned during training. The dimension of character-level representation is 100 and the CNN sliding windows of filters are [2, 3, 4, 5]. The total number of CNN filters is 100. Each bidirectional encoder GRU has a depth of 3 and hidden size of 128. Each decoder GRU has a depth of 3 and hidden size of 256. Note that the encoder GRU is bidirectional and the decoder GRU is unidirectional in our model. Thus, the decoder has twice the hidden size of the encoder. The Adam optimizer was adopted with a learning rate of 0.001, selected from {0.01, 0.001, 0.0001}. We use a dropout of 0.5 after the convolution or recurrent layers. The decay rate is 0.09 and the gradient clip is 5.0. |
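
The hyperparameters in the Experiment Setup row map onto a compact encoder-decoder configuration. The PyTorch sketch below is a hypothetical rendering of those quoted values only, not the authors' implementation (no code was released): the module names, the even split of the 100 character-CNN filters across the four window widths, and feeding encoder outputs directly into the decoder are assumptions, and the pointer-network attention and adversarial transfer components of AT-BDRY are omitted.

```python
# Hypothetical sketch of the quoted configuration; names are illustrative only.
import torch
import torch.nn as nn

WORD_DIM = 300              # GloVe 300-d word embeddings, fine-tuned during training
CHAR_DIM = 100              # character-level representation size
CHAR_WINDOWS = [2, 3, 4, 5] # CNN sliding-window widths
CHAR_FILTERS = 100          # total number of CNN filters
ENC_HIDDEN = 128            # bidirectional encoder GRU hidden size
DEC_HIDDEN = 256            # unidirectional decoder GRU hidden size (2 x encoder)
DEPTH = 3                   # GRU depth for encoder and decoder
DROPOUT = 0.5

class CharCNN(nn.Module):
    """Character features via multi-width 1-D convolutions (assumed design)."""
    def __init__(self, char_vocab_size: int):
        super().__init__()
        self.char_emb = nn.Embedding(char_vocab_size, CHAR_DIM)
        # Split the 100-filter budget evenly across window widths (an assumption;
        # the paper only states the total).
        per_window = CHAR_FILTERS // len(CHAR_WINDOWS)
        self.convs = nn.ModuleList(
            nn.Conv1d(CHAR_DIM, per_window, kernel_size=w, padding=w - 1)
            for w in CHAR_WINDOWS
        )

    def forward(self, chars):                       # chars: (N, max_chars)
        x = self.char_emb(chars).transpose(1, 2)    # -> (N, CHAR_DIM, max_chars)
        pooled = [conv(x).max(dim=2).values for conv in self.convs]
        return torch.cat(pooled, dim=1)             # -> (N, CHAR_FILTERS)

class BoundaryEncoderDecoder(nn.Module):
    """BiGRU encoder + unidirectional GRU decoder with the quoted sizes."""
    def __init__(self, word_vocab_size: int, char_vocab_size: int):
        super().__init__()
        self.word_emb = nn.Embedding(word_vocab_size, WORD_DIM)  # init from GloVe
        self.char_cnn = CharCNN(char_vocab_size)
        self.dropout = nn.Dropout(DROPOUT)
        self.encoder = nn.GRU(WORD_DIM + CHAR_FILTERS, ENC_HIDDEN, num_layers=DEPTH,
                              bidirectional=True, batch_first=True)
        self.decoder = nn.GRU(2 * ENC_HIDDEN, DEC_HIDDEN, num_layers=DEPTH,
                              batch_first=True)

    def forward(self, word_ids, char_ids):
        # word_ids: (batch, seq_len); char_ids: (batch*seq_len, max_chars)
        words = self.word_emb(word_ids)
        chars = self.char_cnn(char_ids).view(word_ids.size(0), word_ids.size(1), -1)
        enc_in = self.dropout(torch.cat([words, chars], dim=-1))
        enc_out, _ = self.encoder(enc_in)   # (batch, seq_len, 2*ENC_HIDDEN)
        # Simplification: the decoder consumes encoder outputs directly; the
        # pointer attention and adversarial domain discriminator are omitted.
        dec_out, _ = self.decoder(enc_out)  # (batch, seq_len, DEC_HIDDEN)
        return enc_out, dec_out

model = BoundaryEncoderDecoder(word_vocab_size=30_000, char_vocab_size=100)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # from {0.01, 0.001, 0.0001}
# Gradient clipping at 5.0 would be applied after each backward() call:
# torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=5.0)
```

The 'Dataset Splits' row describes how annotated target-domain data is constructed for the in-domain baselines: 20% of the training set is randomly left out and combined with the development set. Below is a minimal illustration, assuming sentences are held in Python lists; the helper name and seed are hypothetical and this is not the authors' preprocessing code.

```python
import random

def annotated_target_split(train_sentences, dev_sentences, seed=0):
    """Randomly leave out 20% of the training set and combine it with the
    development set (illustrative sketch only)."""
    rng = random.Random(seed)
    shuffled = list(train_sentences)
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    remaining_train = shuffled[:cut]                         # 80% kept for training
    annotated_target = shuffled[cut:] + list(dev_sentences)  # 20% + dev set
    return remaining_train, annotated_target
```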
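As a design note drawn from the quoted setup, the decoder hidden size (256) is exactly twice the encoder hidden size (128) because the bidirectional encoder concatenates forward and backward states, so the decoder consumes 256-dimensional encoder representations.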