GNNDelete: A General Strategy for Unlearning in Graph Neural Networks

Authors: Jiali Cheng, George Dasoulas, Huan He, Chirag Agarwal, Marinka Zitnik

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments on seven real-world graphs, showing that GNNDELETE outperforms existing approaches by up to 38.8% (AUC) on edge, node, and node feature deletion tasks, and 32.2% on distinguishing deleted edges from non-deleted ones. Additionally, GNNDELETE is efficient, taking 12.3x less time and 9.3x less space than retraining GNN from scratch on WordNet18.
Researcher Affiliation | Collaboration | Jiali Cheng (1), George Dasoulas* (2), Huan He* (2), Chirag Agarwal (3), Marinka Zitnik (2); (1) University of Massachusetts Lowell, (2) Harvard University, (3) Adobe
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks labeled as such.
Open Source Code | Yes | Code and datasets for GNNDELETE can be found at https://github.com/mims-harvard/GNNDelete.
Open Datasets | Yes | We use 5 homogeneous graphs: Cora (Bojchevski &amp; Günnemann, 2018), PubMed (Bojchevski &amp; Günnemann, 2018), DBLP (Bojchevski &amp; Günnemann, 2018), CS (Bojchevski &amp; Günnemann, 2018), OGB-Collab (Hu et al., 2020), and 2 heterogeneous graphs: OGB-BioKG (Hu et al., 2020), and WordNet18RR (Dettmers et al., 2018).
Dataset Splits | Yes | We sample 5% of the total edges as the test set (Et) to evaluate the model's performance on link prediction and sample another 5% as validation set for selecting the best model.
Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory specifications, or cloud computing instance types used for running the experiments.
Software Dependencies | No | The paper mentions various GNN architectures (e.g., GCN, GAT, GIN, R-GCN, R-GAT) and cites their original papers, but does not provide specific version numbers for software libraries, programming languages, or other ancillary software dependencies necessary for replication.
Experiment Setup | Yes | To perform edge deletion tasks, we delete a varying proportion of edges in Ed between [0.5%-5.0%] of the total edges, with a step size of 0.5%. For larger datasets such as OGB (Hu et al., 2020), we limit the maximum deletion ratio to 2.5%... W^l_D = argmin_{W^l_D} λ·L^l_DEC + (1 − λ)·L^l_NI, where λ ∈ [0, 1] is a regularization coefficient that balances the trade-off between the two properties, and L refers to the distance function. We use Mean Squared Error (MSE) throughout the experiments... For all methods, we use a 2-layer GCN/R-GCN architecture with trainable entity and relation embeddings with 128, 64, and 32 hidden dimensions trained on three datasets...
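The edge split quoted in the Dataset Splits row (5% of edges held out for test, another disjoint 5% for validation) can be sketched as follows. This is a minimal illustration, not the paper's released code; the function name and the use of plain edge tuples are assumptions.

```python
import random

def split_edges(edges, test_frac=0.05, val_frac=0.05, seed=0):
    """Split a list of edges into train/val/test sets.

    Samples test_frac of the edges as the test set and a disjoint
    val_frac as the validation set; the remainder is used for training.
    """
    rng = random.Random(seed)
    shuffled = edges[:]          # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_val = int(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test

# Example: 100 edges -> 90 train, 5 val, 5 test
edges = [(i, i + 1) for i in range(100)]
train, val, test = split_edges(edges)
```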
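The deletion-operator objective quoted in the Experiment Setup row (λ·L^l_DEC + (1 − λ)·L^l_NI, with MSE as the distance function) can be sketched as a weighted sum of two MSE terms. This is a hedged sketch, not the authors' implementation; the function names and the use of plain Python lists in place of GNN representations are illustrative assumptions.

```python
def mse(a, b):
    """Mean squared error between two equal-length vectors (the paper's distance L)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def deletion_loss(pred_deleted, target_deleted,
                  pred_neighborhood, target_neighborhood,
                  lam=0.5):
    """Combined per-layer objective: lam * L_DEC + (1 - lam) * L_NI.

    L_DEC (Deleted Edge Consistency) and L_NI (Neighborhood Influence)
    are both instantiated as MSE here, matching the quoted setup;
    lam in [0, 1] balances the two properties.
    """
    l_dec = mse(pred_deleted, target_deleted)
    l_ni = mse(pred_neighborhood, target_neighborhood)
    return lam * l_dec + (1 - lam) * l_ni
```

In the paper the two terms compare node representations produced with and without the deletion operator; here simple vectors stand in for those representations.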