DeepITE: Designing Variational Graph Autoencoders for Intervention Target Estimation
Authors: Hongyuan Tao, Hang Yu, Jianguo Li
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our extensive testing confirms that DeepITE not only surpasses 13 baseline methods in the Recall@k metric but also demonstrates expeditious inference times, particularly on large graphs. Moreover, incorporating a modest fraction of labeled data (5-10%) substantially enhances DeepITE's performance, further solidifying its practical applicability. In this section, we demonstrate the usefulness of DeepITE on three datasets, comprising one synthetically generated dataset, which provides a controlled environment to test the robustness and scalability of the framework, and two real-world datasets that introduce the complexity of genuine causal systems. We position DeepITE against 13 state-of-the-art (SOTA) methods, spanning three areas of relevance: Intervention Target Estimation (ITE), Explainable AI (XAI), and Root Cause Analysis (RCA), due to their intertwined nature (see more discussions in Section 2 and Appendix B). |
| Researcher Affiliation | Industry | Hongyuan Tao, Ant Group, Hangzhou, China, thy.qy@antgroup.com; Hang Yu, Ant Group, Hangzhou, China, hyu.hugo@antgroup.com; Jianguo Li, Ant Group, Hangzhou, China, lijg.zero@antgroup.com |
| Pseudocode | Yes | The training process is summarized in Algorithm 1. |
| Open Source Code | Yes | Our source code is available at https://github.com/alipay/DeepITE. |
| Open Datasets | Yes | The synthetic data is generated following the method outlined in CI-RCA [4] and in Appendix G.2. Protein Signaling Dataset: The well-known protein signaling dataset, which originates from Sachs et al. [53]... ICASSP-SPGC 2022: The ICASSP-SPGC 2022 dataset [55], derived from active 5G networks... |
| Dataset Splits | Yes | The dataset is partitioned with an 85:5:10 ratio for training, validation, and testing. (See the split sketch after the table.) |
| Hardware Specification | Yes | All the training runs on 4 NVIDIA Tesla P100 GPUs with 50GB of VRAM. All the inference runs on a MacBook Pro 16-inch with a 6-core Intel i7 CPU and 16 GB of RAM. |
| Software Dependencies | No | The paper mentions 'NAdam' and 'Gumbel-Softmax' as the optimizer and sampling technique used, but does not provide version numbers for any software dependencies, such as Python, PyTorch, or TensorFlow. |
| Experiment Setup | Yes | Unless otherwise specified, in all of our experiments for DeepITE, we set the hidden dimension in GAT and MLP to 64. For optimization, we used NAdam [52] with a learning rate of 1×10⁻⁴. We conducted training for 1000 epochs and selected the checkpoint with the lowest training loss. The temperature t for Gumbel-Softmax [23] is calculated by t = 10^(1−0.2e) when epoch e <= 500 and t = 0.5/(e−500) for epoch e > 500. (See the configuration sketch after the table.) |
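
The Dataset Splits row reports an 85:5:10 partition for training, validation, and testing. Below is a minimal sketch of such a split, assuming a PyTorch pipeline; the `TensorDataset`, its size, and the seed are placeholders rather than the authors' actual data-loading code.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder dataset standing in for the paper's samples.
dataset = TensorDataset(torch.randn(1000, 64))

# 85:5:10 split; the test set takes the remainder to avoid rounding gaps.
n = len(dataset)
n_train = int(0.85 * n)
n_val = int(0.05 * n)
n_test = n - n_train - n_val

train_set, val_set, test_set = random_split(
    dataset, [n_train, n_val, n_test],
    generator=torch.Generator().manual_seed(0),  # hypothetical fixed seed
)
```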
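
The Experiment Setup row fixes the hidden width (64), optimizer (NAdam with learning rate 1×10⁻⁴), epoch budget (1000), checkpoint selection by lowest training loss, and a two-phase Gumbel-Softmax temperature anneal. The sketch below wires those values into a training loop. The stand-in model, dummy batch, placeholder loss, and temperature floor are assumptions, and the temperature formula is a reconstruction of extraction-garbled text (with the dropped minus signs restored), so treat this as illustrative rather than the authors' released code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

HIDDEN_DIM = 64    # hidden width reported for the GAT and MLP layers
NUM_EPOCHS = 1000
LR = 1e-4          # NAdam learning rate from the paper

def gumbel_temperature(epoch: int) -> float:
    """Reconstructed anneal: t = 10^(1 - 0.2e) for e <= 500, else 0.5/(e - 500)."""
    if epoch <= 500:
        return 10.0 ** (1.0 - 0.2 * epoch)
    return 0.5 / (epoch - 500)

# Hypothetical stand-in model; the actual DeepITE variational graph
# autoencoder (GAT encoder plus MLP decoder) is not reproduced here.
model = nn.Sequential(nn.Linear(HIDDEN_DIM, HIDDEN_DIM), nn.ReLU(),
                      nn.Linear(HIDDEN_DIM, 2))
optimizer = torch.optim.NAdam(model.parameters(), lr=LR)

x = torch.randn(32, HIDDEN_DIM)      # dummy batch, not the paper's data
best_loss, best_state = float("inf"), None
for epoch in range(1, NUM_EPOCHS + 1):
    # Floor added only for numerical stability in this sketch.
    tau = max(gumbel_temperature(epoch), 1e-3)
    logits = model(x)
    # Differentiable relaxation of discrete intervention-target indicators.
    z = F.gumbel_softmax(logits, tau=tau, hard=False)
    loss = z.pow(2).mean()           # placeholder loss, not the paper's ELBO
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    if loss.item() < best_loss:      # keep checkpoint with lowest training loss
        best_loss = loss.item()
        best_state = {k: v.clone() for k, v in model.state_dict().items()}
```

Note that, per the quoted setup, checkpoint selection uses the lowest training loss rather than validation loss; the loop above mirrors that stated choice.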