Early Discovery of Emerging Entities in Microblogs
Authors: Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda
IJCAI 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results with a large-scale Twitter archive show that the proposed method achieves 83.2% precision of the top 500 discovered emerging entities, which outperforms baselines based on unseen entity recognition with burst detection. |
| Researcher Affiliation | Academia | ¹The University of Tokyo, ²Institute of Industrial Science, The University of Tokyo |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks, nor does it have clearly labeled algorithm sections or code-like formatted procedures. |
| Open Source Code | No | We will release all the datasets (tweet IDs)¹ used in experiments to promote the reproducibility. (Footnote 1: http://www.tkl.iis.u-tokyo.ac.jp/~akasaki/ijcai-19/) - The paper explicitly states releasing datasets, not the source code for the methodology. |
| Open Datasets | Yes | We collected titles of articles that were registered in the Japanese version of Wikipedia from March 11th, 2012 to December 31st, 2015 using the Wikipedia dump on June 20th, 2018. and We will release all the datasets (tweet IDs)¹ used in experiments to promote the reproducibility. (Footnote 1: http://www.tkl.iis.u-tokyo.ac.jp/~akasaki/ijcai-19/) |
| Dataset Splits | Yes | For model selection, we used 10% of the training data as the development data. |
| Hardware Specification | No | The paper does not provide specific hardware details such as exact GPU/CPU models, processor types, or memory amounts used for running its experiments. |
| Software Dependencies | Yes | We used the implementation using MALLET (ver. 2.0.6) [McCallum, 2002]... We used the implementation using Theano (ver. 0.9.0) provided by [Lample et al., 2016]... We tokenized each example by using MeCab (ver. 0.996)³ with ipadic dictionary (ver. 2.7.0)... We used CaboCha (https://taku910.github.io/cabocha/). |
| Experiment Setup | Yes | We therefore empirically set the parameters to k = 5, n = 100 and k = 10. and The hyperparameter C was tuned to 0.125 using the development data. and We optimized the model using stochastic gradient descent and chose the model at the epoch with the highest F1 on the development data. |
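The paper's model-selection protocol (hold out 10% of the training data as a development set, then pick the hyperparameter value that scores best on it, e.g. C = 0.125) can be sketched as below. This is a minimal illustration, not the authors' code: the grid of candidate C values and the `fit`/`score` callables are assumptions, standing in for the paper's actual classifier and evaluation metric.

```python
import random

def train_dev_split(examples, dev_ratio=0.1, seed=0):
    """Hold out a fraction of the training data as a development set,
    as the paper does with 10% of its training data."""
    rng = random.Random(seed)
    shuffled = examples[:]
    rng.shuffle(shuffled)
    n_dev = int(len(shuffled) * dev_ratio)
    return shuffled[n_dev:], shuffled[:n_dev]

def tune_C(train, dev, fit, score, grid=(0.03125, 0.0625, 0.125, 0.25, 0.5)):
    """Pick the C value whose fitted model scores best on the dev set.
    The grid here is a hypothetical example; the paper only reports
    the selected value C = 0.125."""
    best_C, best_score = None, float("-inf")
    for C in grid:
        model = fit(train, C)
        s = score(model, dev)
        if s > best_score:
            best_C, best_score = C, s
    return best_C
```

With a real classifier, `fit` would train on the 90% split and `score` would compute F1 on the held-out 10%, mirroring the paper's choice of the model with the highest development F1.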