Heterogeneous Graph Matching Networks for Unknown Malware Detection
Authors: Shen Wang, Zhengzhang Chen, Xiao Yu, Ding Li, Jingchao Ni, Lu-An Tang, Jiaping Gui, Zhichun Li, Haifeng Chen, Philip S. Yu
IJCAI 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct a systematic evaluation of our model and show that it is accurate in detecting malicious program behavior and can help detect malware attacks with fewer false positives. MatchGNet outperforms the state-of-the-art algorithms in malware detection by generating 50% fewer false positives while keeping zero false negatives. |
| Researcher Affiliation | Collaboration | University of Illinois at Chicago, USA; NEC Laboratories America, USA; Tsinghua University, China |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | We collect a 20-week period of data from a real enterprise network composed of 109 hosts (87 Windows hosts and 22 Linux hosts). The sheer size of the data set is around three terabytes. |
| Dataset Splits | Yes | We evaluate the selection of hyper-parameters of MatchGNet with our validation data set (i.e., data from the sixth week). To simulate unknown program instances, we split the programs in the training data equally into two sets, the known set and the unknown set. In our five weeks of training data, we exclude the programs in the unknown set and only train the model on the programs in the known set. (A minimal sketch of this split appears after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions using 'Adam optimizer' but does not specify version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We find that when MatchGNet has 3 layers and 500 neurons, it reaches the maximal AUC. Larger hyper-parameter values may consume more resources but yield little improvement in AUC. Thus, we use the optimal hyper-parameters as part of the default model and apply them to the rest of our experiments. (A hedged sketch of this selection procedure appears after the table.) |
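
The known/unknown split described under Dataset Splits can be illustrated with a minimal sketch. The function names, event layout, and random seed below are assumptions for illustration only; the paper does not release its splitting code.

```python
import random

# Hypothetical illustration of the split described above: programs in the
# training data are divided equally into a "known" set (used for training)
# and an "unknown" set (held out to simulate unseen program instances).
def split_known_unknown(program_ids, seed=42):
    """Split program IDs equally into a known set and an unknown set."""
    rng = random.Random(seed)
    shuffled = list(program_ids)
    rng.shuffle(shuffled)
    mid = len(shuffled) // 2
    return set(shuffled[:mid]), set(shuffled[mid:])

def filter_training_events(events, known_programs):
    """Keep only training events whose program belongs to the known set."""
    return [e for e in events if e["program_id"] in known_programs]

if __name__ == "__main__":
    programs = [f"prog_{i}" for i in range(10)]
    known, unknown = split_known_unknown(programs)
    # Toy event records standing in for the five weeks of training data.
    events = [{"program_id": p, "week": w} for p in programs for w in range(1, 6)]
    train_events = filter_training_events(events, known)
    print(len(known), len(unknown), len(train_events))
```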
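The hyper-parameter selection reported under Experiment Setup (3 layers and 500 neurons chosen by validation AUC) amounts to a small grid search on the validation week. The sketch below assumes a placeholder `train_and_evaluate` callable and candidate grids that are guesses apart from the reported optimum; it is not the authors' tuning code.

```python
from itertools import product

def select_hyperparameters(train_and_evaluate,
                           layer_choices=(1, 2, 3, 4),
                           hidden_choices=(100, 300, 500, 700)):
    """Return (best_auc, n_layers, n_hidden) over the candidate grid.

    `train_and_evaluate(n_layers=..., n_hidden=...)` is a hypothetical
    callable that trains the model and returns validation AUC.
    """
    best = None
    for n_layers, n_hidden in product(layer_choices, hidden_choices):
        auc = train_and_evaluate(n_layers=n_layers, n_hidden=n_hidden)
        if best is None or auc > best[0]:
            best = (auc, n_layers, n_hidden)
    return best

if __name__ == "__main__":
    # Dummy stand-in whose AUC peaks at (3 layers, 500 neurons),
    # mimicking the optimum reported in the paper.
    dummy = lambda n_layers, n_hidden: (1.0
                                        - abs(n_layers - 3) * 0.05
                                        - abs(n_hidden - 500) / 10000.0)
    print(select_hyperparameters(dummy))
```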