DeepITE: Designing Variational Graph Autoencoders for Intervention Target Estimation

Authors: Hongyuan Tao, Hang Yu, Jianguo Li

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility variables, with the extracted result and the supporting LLM response for each:
Research Type: Experimental
"Our extensive testing confirms that DeepITE not only surpasses 13 baseline methods in the Recall@k metric but also demonstrates expeditious inference times, particularly on large graphs. Moreover, incorporating a modest fraction of labeled data (5-10%) substantially enhances DeepITE's performance, further solidifying its practical applicability. In this section, we demonstrate the usefulness of DeepITE on three datasets, comprising one synthetically generated dataset, which provides a controlled environment to test the robustness and scalability of the framework, and two real-world datasets that introduce the complexity of genuine causal systems. We position DeepITE against 13 state-of-the-art (SOTA) methods, spanning three areas of relevance: Intervention Target Estimation (ITE), Explainable AI (XAI), and Root Cause Analysis (RCA), due to their intertwined nature (see more discussions in Section 2 and Appendix B)."
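Recall@k is the headline metric in that comparison. As context, here is a minimal sketch of how Recall@k is typically computed for intervention target estimation; the function name and array layout are illustrative assumptions, not taken from the DeepITE codebase:

```python
import numpy as np

def recall_at_k(scores, true_targets, k):
    """Fraction of true intervention targets recovered among the k
    highest-scoring nodes. `scores` is a length-n array of per-node
    intervention scores; `true_targets` is a set of node indices that
    were actually intervened on. (Illustrative, not DeepITE's code.)"""
    top_k = np.argsort(scores)[::-1][:k]        # indices of the k largest scores
    hits = len(set(top_k) & set(true_targets))  # true targets found in the top k
    return hits / len(true_targets)

# Example: 2 of the 3 true targets appear among the top-3 predictions.
scores = np.array([0.9, 0.1, 0.8, 0.7, 0.2])
print(recall_at_k(scores, {0, 2, 4}, k=3))      # -> 0.666...
```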
Researcher Affiliation: Industry
Hongyuan Tao, Ant Group, Hangzhou, China (thy.qy@antgroup.com); Hang Yu, Ant Group, Hangzhou, China (hyu.hugo@antgroup.com); Jianguo Li, Ant Group, Hangzhou, China (lijg.zero@antgroup.com)
Pseudocode: Yes
"The training process is summarized in Algorithm 1."
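Algorithm 1 itself is not reproduced in this report. Purely as a point of reference, the following is a minimal, self-contained sketch of what a variational-autoencoder training loop with the stated optimizer settings (NAdam, learning rate 1e-4, 1000 epochs, hidden dimension 64; see the Experiment Setup entry below) can look like. The ToyVGAE model, its layer sizes, and the synthetic data are illustrative stand-ins, not DeepITE's architecture or Algorithm 1:

```python
import torch
import torch.nn as nn

class ToyVGAE(nn.Module):
    """Toy stand-in for a variational autoencoder; not DeepITE's model."""
    def __init__(self, n_nodes, hidden=64):
        super().__init__()
        self.enc = nn.Linear(n_nodes, hidden)
        self.mu = nn.Linear(hidden, hidden)
        self.logvar = nn.Linear(hidden, hidden)
        self.dec = nn.Linear(hidden, n_nodes)

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterization
        recon = self.dec(z)
        kl = -0.5 * torch.mean(1 + logvar - mu**2 - logvar.exp())
        return recon, kl

n_nodes = 11
model = ToyVGAE(n_nodes)
optimizer = torch.optim.NAdam(model.parameters(), lr=1e-4)
x = torch.randn(32, n_nodes)  # synthetic observations, illustrative only

for epoch in range(1000):
    optimizer.zero_grad()
    recon, kl = model(x)
    loss = nn.functional.mse_loss(recon, x) + kl  # ELBO-style objective
    loss.backward()
    optimizer.step()
```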
Open Source Code: Yes
"Our source code is available at https://github.com/alipay/DeepITE."
Open Datasets: Yes
"The synthetic data is generated following the method outlined in CI-RCA [4] and in Appendix G.2. Protein Signaling Dataset: The well-known protein signaling dataset, which originates from Sachs et al. [53]... ICASSP-SPGC 2022: The ICASSP-SPGC 2022 dataset [55], derived from active 5G networks..."
Dataset Splits: Yes
"The dataset is partitioned with an 85:5:10 ratio for training, validation, and testing."
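As an illustration of that ratio, a minimal sketch of an 85:5:10 index split follows; the function name and shuffling logic are assumptions, not the authors' code:

```python
import numpy as np

def split_85_5_10(n_samples, seed=0):
    """Partition sample indices into train/val/test with an 85:5:10 ratio."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)            # shuffle before slicing
    n_train = int(0.85 * n_samples)
    n_val = int(0.05 * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

train_idx, val_idx, test_idx = split_85_5_10(1000)
print(len(train_idx), len(val_idx), len(test_idx))  # 850 50 100
```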
Hardware Specification: Yes
"All the training runs on 4 NVIDIA Tesla P100 GPUs with 50GB of VRAM. All the inference runs on a MacBook Pro 16-inch with a 6-core Intel i7 CPU and 16 GB of RAM."
Software Dependencies: No
The paper mentions 'NAdam' and 'Gumbel-Softmax' as techniques/optimizers used, but does not provide specific version numbers for these or for general software dependencies like Python, PyTorch, or TensorFlow.
Experiment Setup: Yes
"Unless otherwise specified, in all of our experiments for DeepITE, we set the hidden dimension in GAT and MLP to 64. For optimization, we used NAdam [52] with a learning rate of 1×10^-4. We conducted training for 1000 epochs and selected the checkpoints with the lowest training loss. The temperature t for Gumbel-softmax [23] is calculated as t = 10^(1 - 0.2e) when epoch e <= 500 and t = 0.5/(e - 500) for epoch e > 500."
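Read literally, the schedule and optimizer settings above translate into a few lines of PyTorch. The helper below is a sketch of that piecewise schedule together with the stated NAdam configuration; the exponent is transcribed from the garbled extracted text and should be checked against the paper, and the placeholder parameters are illustrative, not DeepITE internals:

```python
import torch

def gumbel_softmax_temperature(e):
    """Piecewise temperature schedule as reconstructed from the
    extracted text; the exponent (1 - 0.2e) is as printed and may
    differ from the paper's original typesetting."""
    if e <= 500:
        return 10 ** (1 - 0.2 * e)
    return 0.5 / (e - 500)

# Stated optimizer configuration: NAdam with a learning rate of 1e-4.
params = [torch.nn.Parameter(torch.zeros(3))]  # placeholder parameters
optimizer = torch.optim.NAdam(params, lr=1e-4)

for epoch in range(1, 1001):
    t = gumbel_softmax_temperature(epoch)
    # A forward pass would then draw relaxed discrete samples with
    # torch.nn.functional.gumbel_softmax(logits, tau=t).
```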