ADA-GAD: Anomaly-Denoised Autoencoders for Graph Anomaly Detection
Authors: Junwei He, Qianqian Xu, Yangbangyan Jiang, Zitai Wang, Qingming Huang
AAAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We validate the effectiveness of our approach through extensive experiments on both synthetic and real-world datasets. |
| Researcher Affiliation | Academia | (1) Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, CAS, Beijing, China; (2) School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing, China; (3) Institute of Information Engineering, CAS, Beijing, China; (4) School of Cyber Security, University of Chinese Academy of Sciences, Beijing, China; (5) Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing, China |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks clearly labeled as 'Algorithm' or 'Pseudocode'. |
| Open Source Code | No | The paper does not provide concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described in this paper. |
| Open Datasets | Yes | We conducted experiments on two datasets injected with synthetic anomalies, Cora (Sen et al. 2008) and Amazon (Shchur et al. 2018), and five manually labeled datasets with anomalies: Weibo (Zhao et al. 2020), Reddit (Kumar, Zhang, and Leskovec 2019), Disney (Müller et al. 2013), Books (Sánchez et al. 2013), and Enron (Sánchez et al. 2013). |
| Dataset Splits | No | The paper mentions training and testing on datasets but does not explicitly provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology) for training, validation, and testing sets needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper mentions using the PyGOD toolbox, GCN, and GAT, but does not provide specific version numbers for these or any other software dependencies needed to replicate the experiment. |
| Experiment Setup | Yes | We set the number of epochs/dropout rate/weight decay to 100/0.1/0.01, respectively. The embedding dimension d is set to 12 for the Disney, Books, and Enron datasets, and 64 for the others. Our ADA-GAD method utilizes GCN as the encoders and decoders, except for the Enron and Weibo datasets, where we adopt GAT as the encoders and GCN as the decoders. For the real-world datasets Disney, Books, and Enron, the encoder depth is set to 2 and the decoder depth to 1. For the other datasets, encoder and decoder depths are both set to 1. During augmentation, the numbers of node and edge masks are each set within the range of 1 to 20. The number of random walks and the walk length for the subgraph mask are both set to 2. l_n, l_e, and l_s are all set to 10, and θ is assigned the smallest G_ano among N_aug random augmentations. In the experiments, N_aug is set to 30. The pretraining epochs and the retraining epochs are both set to 20. AUC (Area under the ROC Curve) (Bradley 1997) is used as the performance metric. We repeat all experiments 10 times using 10 different seeds. |
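
The Experiment Setup row pins down enough hyperparameters that a re-run attempt could start from a config file. The sketch below is a minimal outline under stated assumptions, not the authors' code: `train_ada_gad` is a hypothetical stub (the paper releases no code, per the Open Source Code row), the CONFIG keys are invented names, and only the values are transcribed from the quoted setup; the 10-seed AUC protocol uses scikit-learn's `roc_auc_score`.

```python
# Hypothetical sketch of the reported ADA-GAD setup; only the CONFIG
# values come from the paper's setup description.
import numpy as np
from sklearn.metrics import roc_auc_score

CONFIG = {
    "epochs": 100, "dropout": 0.1, "weight_decay": 0.01,
    "embedding_dim": 64,               # 12 for Disney/Books/Enron
    "encoder": "GCN",                  # GAT for Enron/Weibo
    "decoder": "GCN",
    "encoder_depth": 1,                # 2 for Disney/Books/Enron
    "decoder_depth": 1,
    "mask_range": (1, 20),             # node/edge mask counts in [1, 20]
    "num_walks": 2, "walk_length": 2,  # subgraph mask
    "l_n": 10, "l_e": 10, "l_s": 10,
    "n_aug": 30,                       # theta = smallest G_ano over n_aug augmentations
    "pretrain_epochs": 20, "retrain_epochs": 20,
}

def train_ada_gad(graph, config, seed):
    """Hypothetical stand-in for the actual training/scoring pipeline.

    Returns (per-node anomaly scores, binary ground-truth labels);
    here both are randomly generated so the sketch runs end to end.
    """
    rng = np.random.default_rng(seed)
    n = graph["num_nodes"]
    return rng.random(n), rng.integers(0, 2, size=n)

def mean_auc(graph, n_seeds=10):
    """Repeat the run with 10 different seeds and average AUC, as reported."""
    aucs = []
    for seed in range(n_seeds):
        scores, labels = train_ada_gad(graph, CONFIG, seed)
        aucs.append(roc_auc_score(labels, scores))
    return float(np.mean(aucs)), float(np.std(aucs))

print(mean_auc({"num_nodes": 1000}))  # e.g. (~0.5, ...) for the random stub
```

Even with every value above filled in, the missing pieces flagged in the table (dataset splits, hardware, dependency versions) would still have to be guessed to reproduce the reported numbers exactly.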