CASIE: Extracting Cybersecurity Event Information from Text

Authors: Taneeya Satyapanich, Francis Ferraro, Tim Finin

AAAI 2020, pp. 8749-8757

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We have conducted experiments on each component in the event detection pipeline and the results show that each subsystem performs well.
Researcher Affiliation | Academia | Taneeya Satyapanich, Francis Ferraro, Tim Finin; Computer Science and Electrical Engineering, University of Maryland, Baltimore County, Baltimore, MD 21250 USA; {taneeya1, ferraro, finin}@umbc.edu
Pseudocode | No | The paper describes its architecture and various neural network components using textual descriptions and block diagrams, but does not include any formal pseudocode or algorithm blocks.
Open Source Code | Yes | We make our corpus, annotations and code publicly available (Satyapanich 2019a). Satyapanich, T. 2019a. CASIE Repository. https://github.com/Ebiquity/CASIE.
Open Datasets | Yes | We collected about 5,000 cybersecurity news articles (Cyberwire 2019). These news articles were published in 2017-2019. About 1,000 of them, which mention our five events, were annotated by three experienced computer scientists, using majority vote to select the final annotations. We make our corpus, annotations and code publicly available (Satyapanich 2019a).
Dataset Splits | Yes | We developed CASIE using 8-fold cross-validation on a training set of 900 articles, with 100 articles held out for testing.
Hardware Specification | No | The paper does not explicitly describe any specific hardware specifications (e.g., GPU models, CPU types, or memory) used for running the experiments.
Software Dependencies | No | The paper mentions software tools and models like 'Core NLP', 'DBpedia Spotlight', 'Wikidata', 'Word2vec', and 'BERT-Base Uncased model', but does not provide specific version numbers for these or other key software components or libraries.
Experiment Setup | Yes | We kept all of the word embeddings as input and experimentally found that using the fourth-to-last hidden layer gave the best development performance. Early experiments showed that an attention size of five gave the best performance, which was further improved by using a tanh activation function. The number of nodes of every layer is equal to a half of the number in the previous layer, except for the attention layer (where the output and input sizes are equal) and the CRF layer (where the output size is equal to the number of output classes).
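The split described in the Dataset Splits row (8-fold cross-validation over 900 training articles, with 100 articles held out for testing) can be sketched in plain Python. This is a minimal illustration only: the paper does not say how articles were assigned to folds, so the contiguous fold assignment below is an assumption, not the authors' code.

```python
def eight_fold_splits(n_train=900, k=8):
    """Yield (train_indices, val_indices) pairs for k-fold cross-validation.

    Sketch of the setup reported for CASIE: 900 training articles split
    into 8 folds; the separate 100-article test set is not touched here.
    Contiguous fold assignment is illustrative, not the authors' method.
    """
    indices = list(range(n_train))
    base = n_train // k       # 900 // 8 = 112
    remainder = n_train % k   # first `remainder` folds get one extra article
    folds, start = [], 0
    for i in range(k):
        end = start + base + (1 if i < remainder else 0)
        folds.append(indices[start:end])
        start = end
    for i in range(k):
        val = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, val

splits = list(eight_fold_splits())
```

Each of the 8 iterations trains on 7 folds and validates on the remaining one, so every training article serves as validation data exactly once.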
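The layer-sizing rule quoted in the Experiment Setup row (each layer outputs half the width of the previous one, except the attention layer, which preserves its input width, and the CRF layer, which outputs one unit per class) can be illustrated with a small helper. The starting width, layer count, and class count below are hypothetical examples, not values from the paper.

```python
def layer_widths(input_size, num_layers, attention_at=None, crf_classes=None):
    """Return output widths following the halving rule described in the paper:
    each layer outputs half the previous width, an attention layer keeps its
    input width, and a final CRF layer outputs one unit per class.
    """
    widths = []
    size = input_size
    for i in range(num_layers):
        if attention_at is not None and i == attention_at:
            out = size            # attention layer: output width equals input width
        elif crf_classes is not None and i == num_layers - 1:
            out = crf_classes     # CRF layer: output width equals number of classes
        else:
            out = size // 2       # default: halve the previous layer's width
        widths.append(out)
        size = out
    return widths

# Hypothetical example: 768-dim BERT embeddings in, 4 layers,
# attention as the second layer, CRF over 5 classes.
print(layer_widths(768, 4, attention_at=1, crf_classes=5))  # → [384, 384, 192, 5]
```

The halving widths form a funnel toward the output; only the attention and CRF layers break the pattern, exactly as the quoted description states.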