Early Discovery of Emerging Entities in Microblogs

Authors: Satoshi Akasaki, Naoki Yoshinaga, Masashi Toyoda

IJCAI 2019

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "Experimental results with a large-scale Twitter archive show that the proposed method achieves 83.2% precision for the top 500 discovered emerging entities, outperforming baselines based on unseen entity recognition with burst detection." |
| Researcher Affiliation | Academia | ¹The University of Tokyo, ²Institute of Industrial Science, the University of Tokyo |
| Pseudocode | No | The paper contains no structured pseudocode or algorithm blocks, nor any clearly labeled algorithm sections or code-like formatted procedures. |
| Open Source Code | No | "We will release all the datasets (tweet IDs)¹ used in experiments to promote the reproducibility." (Footnote 1: http://www.tkl.iis.u-tokyo.ac.jp/~akasaki/ijcai-19/) The paper explicitly promises the datasets, not the source code for the methodology. |
| Open Datasets | Yes | "We collected titles of articles that were registered in the Japanese version of Wikipedia from March 11th, 2012 to December 31st, 2015 using the Wikipedia dump on June 20th, 2018." and "We will release all the datasets (tweet IDs)¹ used in experiments to promote the reproducibility." (Footnote 1: http://www.tkl.iis.u-tokyo.ac.jp/~akasaki/ijcai-19/) |
| Dataset Splits | Yes | "For model selection, we used 10% of the training data as the development data." |
| Hardware Specification | No | The paper does not report specific hardware details (e.g., exact GPU/CPU models, processor types, or memory sizes) used for its experiments. |
| Software Dependencies | Yes | "We used the implementation using MALLET (ver. 2.0.6) [McCallum, 2002]... We used the implementation using Theano (ver. 0.9.0) provided by [Lample et al., 2016]... We tokenized each example by using MeCab (ver. 0.996)³ with ipadic dictionary (ver. 2.7.0)... We used CaboCha (https://taku910.github.io/cabocha/)." |
| Experiment Setup | Yes | "We therefore empirically set the parameters to k = 5, n = 100 and k = 10." and "The hyperparameter C was tuned to 0.125 using the development data." and "We optimized the model using stochastic gradient descent and chose the model at the epoch with the highest F1 on the development data." |
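The model-selection protocol quoted in the last two rows (hold out 10% of the training data as development data, then keep the model from the epoch with the highest development F1) can be sketched as follows. This is a hedged illustration, not the authors' code; the function names and the sample F1 scores are invented for the example.

```python
# Sketch of the dev-split and epoch-selection protocol described in the paper.
# split_dev and select_best_epoch are hypothetical helpers, not from the authors.
import random

random.seed(0)

def split_dev(examples, dev_ratio=0.1):
    """Shuffle and carve off dev_ratio of the examples as development data."""
    examples = examples[:]  # copy so the caller's list is left untouched
    random.shuffle(examples)
    n_dev = int(len(examples) * dev_ratio)
    return examples[n_dev:], examples[:n_dev]  # (train, dev)

def select_best_epoch(dev_f1_per_epoch):
    """Return (epoch index, F1) for the epoch with the highest dev F1."""
    best = max(range(len(dev_f1_per_epoch)), key=lambda e: dev_f1_per_epoch[e])
    return best, dev_f1_per_epoch[best]

train, dev = split_dev(list(range(1000)))
print(len(train), len(dev))  # -> 900 100

# Invented per-epoch dev F1 scores, purely illustrative.
epoch, f1 = select_best_epoch([0.71, 0.78, 0.81, 0.79])
print(epoch, f1)  # -> 2 0.81
```

The paper reports only the selection rule, not the shuffling or seeding details, so those are assumptions here.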