GNNDelete: A General Strategy for Unlearning in Graph Neural Networks
Authors: Jiali Cheng, George Dasoulas, Huan He, Chirag Agarwal, Marinka Zitnik
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments on seven real-world graphs, showing that GNNDELETE outperforms existing approaches by up to 38.8% (AUC) on edge, node, and node feature deletion tasks, and 32.2% on distinguishing deleted edges from non-deleted ones. Additionally, GNNDELETE is efficient, taking 12.3x less time and 9.3x less space than retraining a GNN from scratch on WordNet18. |
| Researcher Affiliation | Collaboration | Jiali Cheng1 George Dasoulas*2 Huan He*2 Chirag Agarwal3 Marinka Zitnik2 1University of Massachusetts Lowell 2Harvard University 3Adobe |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks labeled as such. |
| Open Source Code | Yes | Code and datasets for GNNDELETE can be found at https://github.com/mims-harvard/GNNDelete. |
| Open Datasets | Yes | We use 5 homogeneous graphs: Cora (Bojchevski & Günnemann, 2018), PubMed (Bojchevski & Günnemann, 2018), DBLP (Bojchevski & Günnemann, 2018), CS (Bojchevski & Günnemann, 2018), OGB-Collab (Hu et al., 2020), and 2 heterogeneous graphs: OGB-BioKG (Hu et al., 2020), and WordNet18RR (Dettmers et al., 2018). |
| Dataset Splits | Yes | We sample 5% of the total edges as the test set (Et) to evaluate the model's performance on link prediction and sample another 5% as the validation set for selecting the best model. |
| Hardware Specification | No | The paper does not provide specific hardware details such as GPU or CPU models, memory specifications, or cloud computing instance types used for running the experiments. |
| Software Dependencies | No | The paper mentions various GNN architectures (e.g., GCN, GAT, GIN, R-GCN, R-GAT) and cites their original papers, but does not provide specific version numbers for software libraries, programming languages, or other ancillary software dependencies necessary for replication. |
| Experiment Setup | Yes | To perform edge deletion tasks, we delete a varying proportion of edges in Ed between [0.5%-5.0%] of the total edges, with a step size of 0.5%. For larger datasets such as OGB (Hu et al., 2020), we limit the maximum deletion ratio to 2.5%... W^l_D = argmin_{W^l_D} λ L^l_DEC + (1 − λ) L^l_NI, where λ ∈ [0, 1] is a regularization coefficient that balances the trade-off between the two properties, and L refers to the distance function. We use Mean Squared Error (MSE) throughout the experiments... For all methods, we use a 2-layer GCN/R-GCN architecture with trainable entity and relation embeddings with 128, 64, and 32 hidden dimensions trained on three datasets... |
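The experiment-setup cell above describes a λ-weighted objective that trades off a deletion loss (L_DEC) against a neighborhood-influence loss (L_NI), with MSE as the distance function. A minimal Python sketch of that weighting, assuming hypothetical helper names (`mse`, `combined_loss`) that are illustrative and not taken from the GNNDelete codebase:

```python
# Sketch of the lambda-weighted deletion objective from the paper's setup:
#   L = lam * L_DEC + (1 - lam) * L_NI,  lam in [0, 1]
# Helper names are hypothetical; in practice the losses would be computed
# over node representations with a framework such as PyTorch.

def mse(pred, target):
    """Mean squared error between two equal-length vectors (the paper's
    choice of distance function L)."""
    assert len(pred) == len(target) and len(pred) > 0
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)

def combined_loss(loss_dec, loss_ni, lam):
    """Balance the deleted-edge loss against the neighborhood-influence
    loss with regularization coefficient lam in [0, 1]."""
    assert 0.0 <= lam <= 1.0
    return lam * loss_dec + (1.0 - lam) * loss_ni

# Example: equal weighting of the two scalar losses.
loss = combined_loss(mse([1.0, 2.0], [1.0, 4.0]),  # L_DEC = 2.0
                     mse([0.0, 0.0], [0.0, 0.0]),  # L_NI = 0.0
                     lam=0.5)                       # -> 1.0
```

Setting `lam=1.0` optimizes only the deletion objective, while `lam=0.0` preserves only neighborhood representations; intermediate values interpolate between the two.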