SAD: Semi-Supervised Anomaly Detection on Dynamic Graphs

Authors: Sheng Tian, Jihai Dong, Jintang Li, Wenlong Zhao, Xiaolong Xu, Baokun Wang, Bowen Song, Changhua Meng, Tianyi Zhang, Liang Chen

IJCAI 2023 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments on four real-world datasets demonstrate that SAD efficiently discovers anomalies from dynamic graphs and outperforms existing advanced methods even when provided with only little labeled data.
Researcher Affiliation | Collaboration | Ant Group; Sun Yat-sen University
Pseudocode | No | The paper describes its framework and components using text and mathematical equations, but does not include any explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Code is made publicly available at https://github.com/D10Andy/SAD for reproducibility.
Open Datasets | Yes | In this paper, we use four real-world datasets, including three public bipartite interaction dynamic graphs and an industrial dataset. Wikipedia [Kumar et al., 2019] is a dynamic network tracking user edits on wiki pages... Reddit [Kumar et al., 2019] is a dynamic network tracking active users posting in subreddits... MOOC [Kumar et al., 2019] is a dynamic network tracking students' actions on MOOC online course platforms...
Dataset Splits | Yes | For all tasks and datasets, we adopt the same chronological split with 70% for training and 15% for validation and testing according to node interaction timestamps.
Hardware Specification | Yes | The proposed method is implemented using PyTorch [Paszke et al., 2019] 1.10.1 and trained on a cloud server with an NVIDIA Tesla V100 GPU.
Software Dependencies | Yes | The proposed method is implemented using PyTorch [Paszke et al., 2019] 1.10.1 and trained on a cloud server with an NVIDIA Tesla V100 GPU.
Experiment Setup | Yes | Regarding the proposed SAD model, we use a two-layer, two-head TGAT [Xu et al., 2020a] as the graph network encoder to produce 128-dimensional node representations. For model hyperparameters, we fix the following configuration across all experiments without further tuning: we adopt Adam as the optimizer with an initial learning rate of 0.0005 and a batch size of 256 for training. We adopt mini-batch training for SAD and sample two-hop subgraphs with 20 nodes per hop. For the memory bank, we set the memory size M to 4000 and the sampled size Ms to 1000. To better measure model performance in terms of the AUC metric, we choose the node classification task (similar to anomaly node detection) as the downstream task, so we adopt Eq.(10) as the model optimization objective and find that picking α = 0.1 and β = 0.01 performs well across all datasets.
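
The Dataset Splits row describes a chronological 70%/15%/15% split ordered by interaction timestamp. Below is a minimal sketch of how such a split could be derived from an event stream; the function name and the event layout are illustrative assumptions, not the authors' released code.

```python
import numpy as np

def chronological_split(timestamps, train_frac=0.70, val_frac=0.15):
    """Split interaction events chronologically (70% / 15% / 15% by default).

    `timestamps` is a 1-D array of interaction times; the boundaries are the
    time values below which 70% and 85% of events fall, mirroring a split
    "according to node interaction timestamps". Illustrative sketch only.
    """
    ts = np.asarray(timestamps)
    train_cut = np.quantile(ts, train_frac)            # time below which 70% of events fall
    val_cut = np.quantile(ts, train_frac + val_frac)   # 85% boundary
    train_mask = ts <= train_cut
    val_mask = (ts > train_cut) & (ts <= val_cut)
    test_mask = ts > val_cut
    return train_mask, val_mask, test_mask

# Example usage on a toy edge stream of (src, dst, t) interactions:
# events = [(0, 5, 1.0), (1, 6, 2.0), (0, 7, 3.5), ...]
# train_mask, val_mask, test_mask = chronological_split([t for _, _, t in events])
```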
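The Experiment Setup row fixes a single hyperparameter configuration across all experiments. The values are collected in one place below for reference; the dictionary keys are illustrative names and do not correspond to the released code's actual configuration interface.

```python
# Hyperparameters reported in the Experiment Setup row, gathered in one dict.
# Key names are assumptions for readability, not the repository's CLI flags.
SAD_CONFIG = {
    "encoder": "TGAT",            # two-layer, two-head TGAT [Xu et al., 2020a]
    "num_layers": 2,
    "num_heads": 2,
    "embedding_dim": 128,         # node representation size
    "optimizer": "Adam",
    "learning_rate": 5e-4,
    "batch_size": 256,
    "num_hops": 2,                # sampled subgraph depth
    "neighbors_per_hop": 20,
    "memory_size": 4000,          # memory bank size M
    "memory_sample_size": 1000,   # sampled size Ms
    "alpha": 0.1,                 # loss weight in Eq.(10)
    "beta": 0.01,                 # loss weight in Eq.(10)
}
```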
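The memory bank sizes (M = 4000, Ms = 1000) suggest a fixed-capacity buffer from which a subset is drawn at each step. The sketch below shows one plausible reading of that mechanism; the FIFO eviction policy, the stored contents, and the class name are assumptions, since the row does not specify them.

```python
import random
from collections import deque

class MemoryBank:
    """Fixed-capacity buffer with random subsampling.

    A plausible reading of "memory size M = 4000, sampled size Ms = 1000";
    the FIFO eviction and what the bank stores are assumptions here.
    """

    def __init__(self, capacity=4000, sample_size=1000):
        self.buffer = deque(maxlen=capacity)  # oldest entries evicted first
        self.sample_size = sample_size

    def push(self, items):
        """Add a batch of entries (e.g. node embeddings or scores) to the bank."""
        self.buffer.extend(items)

    def sample(self):
        """Draw up to `sample_size` stored entries uniformly at random."""
        k = min(self.sample_size, len(self.buffer))
        return random.sample(list(self.buffer), k)
```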