End-to-End Bootstrapping Neural Network for Entity Set Expansion
Authors: Lingyong Yan, Xianpei Han, Ben He, Le Sun (pp. 9402-9409)
AAAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results demonstrate substantial improvement of our model over previous ESE approaches. |
| Researcher Affiliation | Academia | (1) Chinese Information Processing Laboratory, (2) State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; (3) University of Chinese Academy of Sciences, Beijing, China. {lingyong2014, xianpei, sunle}@iscas.ac.cn, benhe@ucas.ac.cn |
| Pseudocode | Yes | Algorithm 1 Optimization Algorithm |
| Open Source Code | Yes | Source code is available online (footnote 5: https://github.com/lingyongyan/bootstrapnet). |
| Open Datasets | Yes | Datasets: We use two datasets, CoNLL and OntoNotes, constructed by Zupon et al. (2019). CoNLL is constructed from the CoNLL 2003 shared task dataset (Tjong Kim Sang and De Meulder 2003), which contains 4 entity types. OntoNotes is constructed from the OntoNotes datasets (Pradhan et al. 2013) without numerical categories, which finally contains 11 entity types. Zupon et al. (2019) use the n-grams of the size up to 4 tokens on either side of an entity as the patterns and filter out some patterns. |
| Dataset Splits | Yes | To learn our model, we randomly select other 30 entities per category with their labels from each dataset as the development set, and leave the remaining entities as the test set. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments. |
| Software Dependencies | No | The paper mentions 'scikit-learn package' without a specific version number and does not provide other key software components with their versions. |
| Experiment Setup | Yes | For all baselines and our model, we manually select 10 seeds per category with the highest frequency in the datasets and run them for 20 bootstrapping iterations. At each bootstrapping iteration, we add 10 entities and 10 patterns to each category. The number of layers in Bootstrap Encoder is set to 3. |
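
The seed selection, data split, and bootstrapping settings quoted in the table can be collected into a small sketch. This is an illustration under stated assumptions, not the authors' implementation: `ExpansionSettings`, `split_entities`, the `(entity, category)` input format, and the alphabetical tie-breaking are all hypothetical. The paper says seeds were selected manually among the most frequent entities, which the sketch approximates by simply taking the top-ranked ones.

```python
import random
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExpansionSettings:
    """Values quoted in the table above; the field names are hypothetical."""
    seeds_per_category: int = 10
    dev_entities_per_category: int = 30
    bootstrapping_iterations: int = 20
    entities_added_per_iteration: int = 10
    patterns_added_per_iteration: int = 10
    encoder_layers: int = 3


def split_entities(mentions, settings, rng):
    """Split entities into seeds, development set, and test set per category.

    `mentions` is a list of (entity, category) pairs, one pair per occurrence
    (an assumed input format, not the format of the released datasets).
    """
    counts_by_category = {}
    for entity, category in mentions:
        counts_by_category.setdefault(category, Counter())[entity] += 1

    splits = {}
    for category, counts in counts_by_category.items():
        # Rank by descending frequency; ties are broken alphabetically,
        # which is an assumption made only to keep the sketch deterministic.
        ranked = sorted(counts, key=lambda e: (-counts[e], e))
        seeds = ranked[:settings.seeds_per_category]
        rest = ranked[settings.seeds_per_category:]
        # Randomly draw the development entities from the non-seed entities.
        dev_size = min(settings.dev_entities_per_category, len(rest))
        dev = rng.sample(rest, dev_size)
        dev_set = set(dev)
        # Everything that is neither a seed nor a dev entity is left for testing.
        test = [e for e in rest if e not in dev_set]
        splits[category] = {"seeds": seeds, "dev": dev, "test": test}
    return splits
```

With these settings, each category would receive 20 × 10 = 200 new entities over a full run, on top of its 10 seeds. Passing a seeded `random.Random` instance as `rng` keeps the development split repeatable across runs.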