Collaboration Based Multi-Label Propagation for Fraud Detection

Authors: Haobo Wang, Zhao Li, Jiaming Huang, Pengrui Hui, Weiwei Liu, Tianlei Hu, Gang Chen

IJCAI 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that the proposed method not only outperforms on ordinary multi-label datasets, but is also effective and scalable on a large-scale e-commerce dataset. (Section 4: Experiments; Table 1: transductive performance comparison on ordinary multi-label datasets; Table 2: transductive performance comparison of three graph-based algorithms on the Taobao-FUD dataset.)
Researcher Affiliation | Collaboration | 1) Key Lab of Intelligent Computing Based Big Data of Zhejiang Province, Zhejiang University; 2) Alibaba Group, Hangzhou, China; 3) School of Computer Science, Wuhan University
Pseudocode | No | The paper describes the algorithms using mathematical equations and descriptive text but does not include a formal pseudocode block or algorithm box.
Open Source Code | No | The paper does not provide any links to source code or explicitly state that the code for the described methodology is publicly available.
Open Datasets | Yes | We choose four real-world multi-label datasets from different task domains: 1) Medical [Pestian et al., 2007]: a text dataset... 2) Image [Wang et al., 2019]: a collection of... 3) Slashdot [Read et al., 2009]: a web text dataset... 4) Eurlex-sm [Loza Mencía and Fürnkranz, 2008]: a large text dataset...
Dataset Splits | Yes | All the datasets are randomly partitioned into 5% labeled data and 95% unlabeled data. We randomly select 5% of the examples as labeled data, and the rest are used to evaluate the transductive performance.
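The 5%/95% transductive split described above can be sketched as follows. This is a minimal illustration; the function name `transductive_split` and the fixed seed are our own choices, not from the paper:

```python
import random

def transductive_split(n_examples, labeled_frac=0.05, seed=0):
    """Randomly mark a small fraction of examples as labeled; the
    remaining examples stay unlabeled and are used to evaluate
    transductive performance."""
    rng = random.Random(seed)
    indices = list(range(n_examples))
    rng.shuffle(indices)
    n_labeled = round(labeled_frac * n_examples)
    return indices[:n_labeled], indices[n_labeled:]

labeled, unlabeled = transductive_split(1000)
# 50 labeled examples, 950 unlabeled examples
```

Fixing the seed makes the partition reproducible across runs, which matters when comparing several algorithms on the same labeled/unlabeled split.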
Hardware Specification | No | The computations are performed on the MaxCompute platform, a fast, distributed, and fully hosted GB/TB/PB-level data warehouse solution. We use three computation instances for the time comparison and 3000 instances for the performance comparison. This mentions a platform and generic instances but no specific hardware components such as CPU/GPU models or memory.
Software Dependencies | No | The paper mentions applying a three-layer neural network and building a k-NN adjacency graph, but it does not specify any software names with version numbers (e.g., Python, TensorFlow, PyTorch versions).
Experiment Setup | Yes | For our methods, γ is selected from {0.1, 1, 10, 100} and α is chosen from {0.01, 0.05, 0.1, 0.2, 0.5}; β, λ and µ are empirically fixed to 0.1. For DeepFraud, we apply a three-layer neural network with ReLU activation; the hidden size is set to 128, and the learning rate and regularization parameter are set to 0.001 and 0.5. When building graphs, k is set to 20.
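The hyperparameter selection above amounts to a small grid search over γ and α with the remaining parameters held fixed. A minimal sketch, assuming the caller supplies a `train_and_eval` scoring function (a hypothetical stand-in for fitting and evaluating the propagation model, not part of the paper):

```python
from itertools import product

# Grid from the paper's setup: gamma and alpha are searched,
# while beta, lambda, mu and the k-NN graph size k stay fixed.
GAMMA_GRID = [0.1, 1, 10, 100]
ALPHA_GRID = [0.01, 0.05, 0.1, 0.2, 0.5]
FIXED = {"beta": 0.1, "lambda": 0.1, "mu": 0.1, "k": 20}

def grid_search(train_and_eval):
    """Return the best-scoring configuration over the gamma/alpha grid."""
    best_cfg, best_score = None, float("-inf")
    for gamma, alpha in product(GAMMA_GRID, ALPHA_GRID):
        cfg = {"gamma": gamma, "alpha": alpha, **FIXED}
        score = train_and_eval(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

With 4 values of γ and 5 of α, the search evaluates 20 configurations per dataset; keeping β, λ, µ and k fixed keeps this tractable on large graphs.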