Interpreting Unfairness in Graph Neural Networks via Training Node Attribution

Authors: Yushun Dong, Song Wang, Jing Ma, Ninghao Liu, Jundong Li

AAAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We verify the validity of PDD and the effectiveness of influence estimation through experiments on real-world datasets."
Researcher Affiliation | Academia | "¹University of Virginia ²University of Georgia {yd6eb, sw3wv, jm3mr, jundong}@virginia.edu, ninghao.liu@uga.edu"
Pseudocode | Yes | "Algorithm 1: Node Influence on Model Bias Estimation"
Open Source Code | Yes | "Open-source code can be found at https://github.com/yushundong/BIND."
Open Datasets | Yes | "Four real-world datasets are adopted in our experiments, including Income, Recidivism, Pokec-z, and Pokec-n. Specifically, Income is collected from the Adult Data Set (Dua and Graff 2017). Recidivism is collected from (Jordan and Freiburger 2015). Pokec-z and Pokec-n are collected from Pokec, which is a popular social network in Slovakia (Takac and Zabovsky 2012)."
Dataset Splits | No | The paper frequently mentions using a "test set" for evaluation, but it does not specify explicit training/validation/test splits (e.g., percentages or counts) or a cross-validation strategy.
Hardware Specification | No | The paper does not specify any hardware details such as GPU models, CPU types, or memory used for running the experiments.
Software Dependencies | No | The paper mentions using GCN as the backbone model, but it does not specify version numbers for any software dependencies (e.g., specific Python, PyTorch, or library versions).
Experiment Setup | Yes | "Specifically, we set Γ = λΓ_SP + (1 − λ)Γ_EO and estimate the node influence on Γ to consider both statistical parity and equal opportunity. We then set a budget k, and follow the strategy adopted in Section to select and delete a set of training nodes with the largest positive influence summation on Γ under this budget. We set λ = 0.5 to assign statistical parity and equal opportunity the same weight, and perform experiments with k being 1% (denoted as BIND 1%) and 10% (denoted as BIND 10%) of the total number of training nodes."
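The setup quoted above (combining the statistical-parity and equal-opportunity bias metrics with weight λ, then deleting the k% of training nodes with the largest positive combined influence) can be sketched as follows. This is a hypothetical illustration, not the authors' released code: the function names and the assumption that per-node influence scores on each metric are already available (e.g., from the paper's Algorithm 1) are ours.

```python
import numpy as np

# Hypothetical sketch of the selection step described in the Experiment Setup
# row. Influence scores on each bias metric are assumed precomputed; the
# function names below are illustrative, not the authors' API.

def statistical_parity_diff(y_pred, sens):
    """Delta_SP = |P(yhat=1 | s=0) - P(yhat=1 | s=1)| on binary predictions."""
    y_pred, sens = np.asarray(y_pred), np.asarray(sens)
    return abs(y_pred[sens == 0].mean() - y_pred[sens == 1].mean())

def equal_opportunity_diff(y_pred, y_true, sens):
    """Delta_EO = |P(yhat=1 | y=1, s=0) - P(yhat=1 | y=1, s=1)|."""
    y_pred, y_true, sens = map(np.asarray, (y_pred, y_true, sens))
    pos = y_true == 1
    return abs(y_pred[pos & (sens == 0)].mean()
               - y_pred[pos & (sens == 1)].mean())

def select_nodes_to_delete(infl_sp, infl_eo, lam=0.5, budget_frac=0.01):
    """Combine per-node influences as Gamma = lam*Gamma_SP + (1-lam)*Gamma_EO,
    then return up to k = budget_frac * n nodes with the largest positive
    combined influence (deleting them should reduce the combined bias)."""
    gamma = lam * np.asarray(infl_sp) + (1 - lam) * np.asarray(infl_eo)
    k = max(1, int(budget_frac * gamma.size))
    order = np.argsort(-gamma)       # indices sorted by descending influence
    top = order[:k]
    return top[gamma[top] > 0]       # keep only positive-influence nodes
```

With λ = 0.5 this weights both metrics equally, matching the paper's setting; BIND 1% and BIND 10% correspond to `budget_frac` of 0.01 and 0.1.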