A Monte Carlo Tree Search approach to Active Malware Analysis
Authors: Riccardo Sartea, Alessandro Farinelli
IJCAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our solution using clustering techniques on models generated by analyzing real malware samples. Results show that our approach learns faster than existing techniques even without any prior information on the samples. |
| Researcher Affiliation | Academia | Riccardo Sartea University of Verona Department of Computer Science riccardo.sartea@univr.it Alessandro Farinelli University of Verona Department of Computer Science alessandro.farinelli@univr.it |
| Pseudocode | Yes | Algorithm 1 Monte Carlo Analysis; Algorithm 2 Default Policy |
| Open Source Code | No | The paper does not provide any links to its source code or state that the code for their proposed methodology is open-source. |
| Open Datasets | Yes | The malware samples have been downloaded from [Xi'an Jiaotong University, 2011], for a total of 40 samples, 10 for each family. http://sanddroid.xjtu.edu.cn:8080 |
| Dataset Splits | No | The paper mentions repeating analysis and clustering 10 times and using 40 samples from different families but does not provide specific training, validation, or test dataset splits (e.g., percentages or sample counts). |
| Hardware Specification | No | The paper mentions using an "Android emulator" and setting a computational limit based on "emulator boot time" and "installation of the malware sample on the guest machine," but it does not specify any hardware details like CPU, GPU, or memory of the machine running the emulator. |
| Software Dependencies | No | The analysis environment is based on the Cuckoo sandbox [Cuckoo Foundation, 2016], specifically modified to meet the requirements of AMA. The paper mentions Cuckoo sandbox but does not provide a specific version number for it or any other software dependencies. |
| Experiment Setup | Yes | We tested different game lengths from 1 to 10 (figure 3); We used Cp = 1/√2, obtaining a good balance between exploitation of actions that are known to trigger malware responses, and exploration of actions that have unknown outcome; In our experiments we set the computational limit of MCTS to fit the Android emulator boot time, plus the time for the installation of the malware sample on the guest machine (about 30s in total for each analyzer action); We applied K-Means clustering, repeating analysis and clustering 10 times, and computing results as the average in terms of purity, inverse purity and f-score w.r.t. our ground truth. |
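
The Experiment Setup row reports clustering quality as purity, inverse purity and f-score against a ground truth of malware families. The paper's own evaluation code is not available, so the following is only a minimal sketch of how these three metrics are commonly computed. The function names are hypothetical, and taking the f-score as the harmonic mean of purity and inverse purity is an assumption (other definitions exist).

```python
from collections import Counter


def purity(clusters, labels):
    """Fraction of samples that belong to the majority label of their cluster."""
    counts_per_cluster = {}
    for cluster_id, label in zip(clusters, labels):
        counts_per_cluster.setdefault(cluster_id, Counter())[label] += 1
    majority_total = sum(max(c.values()) for c in counts_per_cluster.values())
    return majority_total / len(labels)


def inverse_purity(clusters, labels):
    """Purity with the roles of clusters and ground-truth labels swapped."""
    return purity(labels, clusters)


def f_score(clusters, labels):
    """Harmonic mean of purity and inverse purity (assumed definition)."""
    p = purity(clusters, labels)
    ip = inverse_purity(clusters, labels)
    return 2 * p * ip / (p + ip) if (p + ip) > 0 else 0.0


# Toy data (not from the paper): 8 samples, 3 clusters, 3 families.
toy_clusters = [0, 0, 0, 1, 1, 2, 2, 2]
toy_families = ["a", "a", "b", "b", "b", "c", "c", "a"]
toy_purity = purity(toy_clusters, toy_families)
toy_inverse_purity = inverse_purity(toy_clusters, toy_families)
toy_f = f_score(toy_clusters, toy_families)
```

Purity alone rewards placing every sample in its own cluster, while inverse purity alone rewards a single all-encompassing cluster. Reporting both, together with a combined score, penalizes each degenerate case. In a setup like the one described, these values would be averaged over the 10 repeated runs.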