Partitioning Message Passing for Graph Fraud Detection

Authors: Wei Zhuo, Zemin Liu, Bryan Hooi, Bingsheng He, Guang Tan, Rizal Fathony, Jia Chen

ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experimental results show that PMP can significantly boost performance on GFD tasks. Our code is available at https://github.com/Xtra-Computing/PMP. [1 INTRODUCTION] With the explosive growth of online information, fraudulent activities have significantly increased in financial networks (Ngai et al., 2011; Lin et al., 2021), social media (Deng et al., 2022), review networks (Rayana & Akoglu, 2015), and academic networks (Cho et al., 2021), making the detection of such activities an area of paramount importance. To fully exploit the rich graph structure of fraud graphs, recent studies have increasingly applied Graph Neural Networks (GNNs) (Wu et al., 2020) to Graph Fraud Detection (GFD).
Researcher Affiliation | Collaboration | Wei Zhuo1, Zemin Liu2, Bryan Hooi2, Bingsheng He2, Guang Tan1, Rizal Fathony3, Jia Chen3. 1Shenzhen Campus of Sun Yat-sen University, 2National University of Singapore, 3Grab Taxi Holdings Pte. Ltd.
Pseudocode | Yes | A ALGORITHMIC DETAILS

Algorithm 1: PMP forward propagation
Input: fraud graph G = (V, E_r, X, Y); depth L; batch size B
Output: logits Z ∈ R^{N×1}
1:  for l ∈ {1, ..., L} do
2:    for each batch of size B do
3:      for v_i in batch do
4:        ĥ_i^(l) ← f_self^(l-1)(h_i^(l-1))
5:        α_i^(l-1) ← Φ(h_i^(l-1))
6:        W_fr,i^(l-1) ← Ψ_fr(h_i^(l-1)); W_be,i^(l-1) ← Ψ_be(h_i^(l-1)); W_un,i^(l-1) ← α_i^(l-1) W_fr,i^(l-1) + (1 − α_i^(l-1)) W_be,i^(l-1)  // generate weight matrices for v_i
7:        a_i^(l) ← f_fr,i^(l-1)({h_j^(l-1) | v_j ∈ N_fr(v_i)}) + f_be,i^(l-1)({h_j^(l-1) | v_j ∈ N_be(v_i)}) + f_un,i^(l-1)({h_j^(l-1) | v_j ∉ N_be(v_i) ∪ N_fr(v_i)})  // f_fr,i, f_be,i, and f_un,i are parameterized by W_fr,i, W_be,i, and W_un,i respectively
8:        h_i^(l) ← ĥ_i^(l) + a_i^(l)
9:      end for
10:   end for
11: end for
12: H̃ = MLP(H^(L)) ∈ R^{N×1}
13: Z_i ← Sigmoid(H̃_i)
14: L = −Σ_i (y_i log(Z_i) + (1 − y_i) log(1 − Z_i))  // binary cross-entropy loss
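To ground the pseudocode, the following is a minimal PyTorch sketch of a single PMP layer (Algorithm 1, lines 4-8). The class name PMPLayer, the linear hypernetworks used to realize Ψ_fr and Ψ_be, and the neighbor-list input format are our assumptions for illustration; the released implementation at https://github.com/Xtra-Computing/PMP may differ.

```python
import torch
import torch.nn as nn

class PMPLayer(nn.Module):
    """Sketch of one PMP layer (Algorithm 1, lines 4-8); illustrative, not the authors' code."""

    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.f_self = nn.Linear(in_dim, out_dim)                      # line 4: self transform
        self.phi = nn.Sequential(nn.Linear(in_dim, 1), nn.Sigmoid())  # line 5: gate alpha_i
        # Line 6: Psi_fr / Psi_be produce node-specific weight matrices; modeled here
        # as linear hypernetworks emitting flattened (in_dim x out_dim) matrices.
        self.psi_fr = nn.Linear(in_dim, in_dim * out_dim)
        self.psi_be = nn.Linear(in_dim, in_dim * out_dim)
        self.out_dim = out_dim

    def forward(self, h, nbrs_fr, nbrs_be, nbrs_un):
        # h: (N, in_dim) node states; nbrs_*: per-node LongTensors of neighbor ids,
        # partitioned into labeled-fraud, labeled-benign, and unlabeled neighbors.
        N, d = h.shape
        h_self = self.f_self(h)                            # line 4
        alpha = self.phi(h).unsqueeze(-1)                  # line 5: (N, 1, 1) for broadcasting
        w_fr = self.psi_fr(h).view(N, d, self.out_dim)     # line 6
        w_be = self.psi_be(h).view(N, d, self.out_dim)
        w_un = alpha * w_fr + (1 - alpha) * w_be
        out = []
        for i in range(N):                                 # line 7: dense loop for clarity
            msg = h_self[i]
            for nbrs, w in ((nbrs_fr[i], w_fr[i]), (nbrs_be[i], w_be[i]), (nbrs_un[i], w_un[i])):
                if nbrs.numel() > 0:
                    msg = msg + (h[nbrs] @ w).sum(dim=0)   # aggregate one neighbor partition
            out.append(msg)                                # line 8: h_i = h_hat_i + a_i
        return torch.stack(out)
```

A production version would replace the per-node Python loop with batched sparse message passing, but the loop makes the three-way partition of Algorithm 1 explicit.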
Open Source Code | Yes | Our code is available at https://github.com/Xtra-Computing/PMP.
Open Datasets | Yes | We evaluate our approach using four datasets tailored for GFD: Yelp (Rayana & Akoglu, 2015), Amazon (McAuley & Leskovec, 2013), T-Finance, and T-Social (Tang et al., 2022).
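The Yelp and Amazon fraud graphs ship with DGL's dataset module, which makes them straightforward to obtain; a minimal loading sketch follows. The T-Finance/T-Social path is an assumption, since those graphs are distributed through Tang et al. (2022)'s repository rather than DGL.

```python
from dgl.data import FraudYelpDataset, FraudAmazonDataset

# YelpChi and Amazon ship with DGL as multi-relation fraud graphs.
yelp = FraudYelpDataset()[0]      # relations: net_rur, net_rtr, net_rsr
amazon = FraudAmazonDataset()[0]  # relations: net_upu, net_usu, net_uvu
print(yelp.ntypes, yelp.etypes)

# T-Finance and T-Social come from Tang et al. (2022) (BWGNN repository);
# assuming they are saved as DGL binary graph files, they could be loaded with:
# import dgl
# tfinance_graphs, _ = dgl.load_graphs("tfinance")  # file path is an assumption
```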
Dataset Splits | Yes | Following Tang et al. (2022), we adopt the data splitting ratios of 40%:20%:40% for the training, validation, and test sets in the supervised scenario. In the semi-supervised scenario, the data splitting ratio is 1%:10%:89%.
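A minimal sketch of producing such splits with scikit-learn is shown below. The function name split_indices and the choice of stratified sampling are our assumptions; the paper defers to Tang et al. (2022) for the exact splitting procedure.

```python
import numpy as np
from sklearn.model_selection import train_test_split

def split_indices(labels, train_frac=0.4, val_frac=0.2, seed=0):
    """Stratified train/val/test split; 40%/20%/40% by default.

    For the semi-supervised setting, use train_frac=0.01, val_frac=0.10
    (leaving 89% for testing).
    """
    labels = np.asarray(labels)
    idx = np.arange(len(labels))
    # First carve off the training set, stratified by class.
    train_idx, rest_idx = train_test_split(
        idx, train_size=train_frac, stratify=labels, random_state=seed)
    # Then split the remainder into validation and test.
    val_idx, test_idx = train_test_split(
        rest_idx, train_size=val_frac / (1 - train_frac),
        stratify=labels[rest_idx], random_state=seed)
    return train_idx, val_idx, test_idx
```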
Hardware Specification | Yes | For all experiments, we use a single NVIDIA A100 GPU with 80GB of GPU memory.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup | Yes | We provide the detailed hyperparameter tuning strategies of the baselines and the hyperparameter settings of PMP in Appendix D.4. For each model, we explored the following search ranges for the general hyperparameters: learning rate lr ∈ {0.01, 0.005, 0.001}, weight decay wd ∈ {0, 5e-5, 1e-4}, dropout do ∈ {0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8}, hidden dimension d ∈ {32, 64, 128, 256, 512}. For spatial-based models, the batch size is an important hyperparameter that depends strongly on the graph size.
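For illustration, the quoted ranges define a grid of 360 candidate configurations; a minimal enumeration sketch follows. The training hook train_and_evaluate is hypothetical, standing in for whatever tuning script the authors used.

```python
from itertools import product

# General search ranges quoted in the row above.
grid = {
    "lr": [0.01, 0.005, 0.001],
    "wd": [0.0, 5e-5, 1e-4],
    "dropout": [0.0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8],
    "hidden_dim": [32, 64, 128, 256, 512],
}

# Enumerate all combinations: 3 * 3 * 8 * 5 = 360 configurations.
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
print(f"{len(configs)} candidate configurations")

# for cfg in configs:
#     val_auc = train_and_evaluate(cfg)  # hypothetical hook: train PMP, return validation AUC
```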