BitcoinHeist: Topological Data Analysis for Ransomware Prediction on the Bitcoin Blockchain

Authors: Cuneyt G. Akcora, Yitao Li, Yulia R. Gel, Murat Kantarcioglu

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | By capitalizing on the recent advances in Topological Data Analysis, we propose a novel efficient and tractable framework to automatically predict new ransomware transactions in a ransomware family, given only limited records of past transactions. Moreover, our new methodology exhibits high utility to detect emergence of new ransomware families, that is, detecting ransomware with no past records of transactions.
Researcher Affiliation | Academia | Cuneyt G. Akcora (1), Yitao Li (2), Yulia R. Gel (3) and Murat Kantarcioglu (3). (1) University of Manitoba, Canada; (2) Purdue University, USA; (3) University of Texas at Dallas, USA.
Pseudocode | Yes | Algorithm 1 TDA filtering with multiple attributes.
Open Source Code | No | The paper states that it uses the 'TDAMapper RStats package', which is an external dependency, but it does not release code for its own methodology or features.
Open Datasets | Yes | Datasets of these three studies [Montreal, Princeton, Padua] are publicly available.
Dataset Splits | Yes | For t < t′, use a training length l, and create a dataset X_t which holds features and labels of addresses observed between times t − l and t... Using the ground truth data at t′, take a sample of M = 1000 white (i.e., f0) addresses without replacement: X^0_{t′}.
Hardware Specification | No | The paper does not specify any hardware details (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | For TDA computations, we use the TDAMapper RStats package (https://github.com/paultpearson/TDAmapper) with parameters overlap=40 and interval = 80. The version number of the package is not provided.
Experiment Setup | Yes | In all models, we report the optimal parameters that maximize F1 scores in predictions: In DBSCAN, we experimented with ϵ = 0.05,...,1 values. Random Forest uses ntree=500 and mtry=|X_t|/3. XGBoost uses the gbtree booster and nrounds = 25. For TDA computations, we use the TDAMapper RStats package (https://github.com/paultpearson/TDAmapper) with parameters overlap=40 and interval = 80.
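
The Dataset Splits and Experiment Setup rows describe a sliding training window of length l and a sample of M = 1000 white addresses drawn without replacement. A minimal Python sketch of that bookkeeping follows. It is not the authors' code (the paper reports using R packages). The record layout (`day`, `label`), the function names, the half-open window, and the 0.05 step of the DBSCAN grid are all assumptions made for illustration. Only the numeric settings in `PARAMS` and the endpoints of the ϵ range are taken from the table.

```python
import random

# Settings quoted in the Experiment Setup row; the dictionary layout is illustrative.
PARAMS = {
    "random_forest": {"ntree": 500},
    "xgboost": {"booster": "gbtree", "nrounds": 25},
    "tda_mapper": {"overlap": 40, "interval": 80},
}

# Assumption: the DBSCAN range 0.05,...,1 is stepped by 0.05 (the step is not stated).
EPS_GRID = [round(0.05 * k, 2) for k in range(1, 21)]


def training_window(records, t, l):
    """Keep records observed in the l days before t.

    Assumption: the window is half-open, t - l <= day < t.
    """
    return [r for r in records if t - l <= r["day"] < t]


def sample_white(records, m=1000, seed=0):
    """Draw up to m white-labelled records without replacement."""
    white = [r for r in records if r["label"] == "white"]
    rng = random.Random(seed)
    return rng.sample(white, min(m, len(white)))


if __name__ == "__main__":
    # Synthetic toy records, purely to exercise the two helpers.
    toy = [{"day": d, "label": "white" if d % 2 == 0 else "ransomware"}
           for d in range(100)]
    window = training_window(toy, t=50, l=10)
    print(len(window), len(sample_white(window, m=3)))
```

The fixed seed only makes the sketch repeatable. The table does not say whether the original experiments fixed one.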