reproducibilityindex.ai

EqGNN: Equalized Node Opportunity in Graphs

Authors: Uriel Singer, Kira Radinsky
Pages: 8333-8341

AAAI 2022

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our classifier over several graph datasets and sensitive attributes and show our algorithm reaches state-of-the-art results.
Researcher Affiliation	Academia	Technion, Israel Institute of Technology
Pseudocode	Yes	See the Appendix for the full differential permutation loss algorithm.
Open Source Code	Yes	GitHub repository with appendix, code, baselines, and data: https://github.com/urielsinger/EqGNN
Open Datasets	Yes	Pokec (Takac and Zabovsky 2012). Pokec is a popular social network in Slovakia. An anonymized snapshot of the network was taken in 2012. User profiles include gender, age, hobbies, interest, education, etc. The original Pokec dataset contains millions of users. We sampled a subnetwork of the Zilinsky province. We create two datasets, where the sensitive attribute in one is the gender, and region in the other. The label used for classification is the job of the user. The job field was grouped in the following way: (1) education and student, (2) services & trade and construction, and (3) unemployed. NBA (Dai and Wang 2021). This dataset was presented in the FairGNN baseline paper. The NBA Kaggle dataset contains around 400 basketball players with features including performance statistics, nationality, age, etc. This dataset was extended in (Dai and Wang 2021) to include the relationships of the NBA basketball players on Twitter. The binary sensitive attribute is whether a player is a U.S. player or an overseas player, while the task is to predict whether a salary of the player is over the median.
Dataset Splits	Yes	For all baselines, 50% of nodes are used for training, 25% for validation and 25% for testing.
Hardware Specification	Yes	All experiments used a single Nvidia P100 GPU with the average run of 5 minutes per seed for Pokec and 1 minute for NBA.
Software Dependencies	No	The paper mentions using the Adam optimizer, but it does not specify software dependencies like Python, PyTorch/TensorFlow versions, or other library versions.
Experiment Setup	Yes	For all baselines, 50% of nodes are used for training, 25% for validation and 25% for testing. The validation set is used for choosing the best model for each baseline throughout the training. As the classifier is the only part of the architecture used for testing, an early stopping was implemented after its validation loss (Eq. 7) hasn't improved for 50 epochs. The epoch with the best validation loss was then used for testing. All results are averaged over 20 different train/validation/test splits for Pokec datasets and 40 for the NBA dataset. For fair comparison, we implemented grid-search for all baselines over λ ∈ {0.01, 0.1, 1, 10} for baselines with a discriminator, and γ ∈ {0, 50} for baselines with a covariance expression. For both Pokec datasets and for all baselines λ = 1 and γ = 50, while for NBA we end up using λ = 0.1 and γ = 50 except for FairGNN with λ = 0.01.
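
The protocol quoted under Dataset Splits and Experiment Setup (50/25/25 node split, early stopping with a patience of 50 epochs, grid search over λ and γ) could be sketched roughly as follows. This is an illustrative sketch only and is not taken from the authors' repository: the function names `split_nodes` and `best_epoch_with_early_stopping`, the use of Python's `random` module for shuffling, and the Cartesian product of the two grids (which would only apply to a baseline that has both a discriminator and a covariance term) are all assumptions.

```python
import itertools
import random


def split_nodes(num_nodes, seed, train_frac=0.5, val_frac=0.25):
    """Shuffle node ids and cut them into train/validation/test parts.

    Hypothetical helper: the fractions follow the 50/25/25 split quoted
    above; the shuffling scheme itself is an assumption.
    """
    rng = random.Random(seed)
    ids = list(range(num_nodes))
    rng.shuffle(ids)
    n_train = int(train_frac * num_nodes)
    n_val = int(val_frac * num_nodes)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]  # whatever remains (about 25%)
    return train, val, test


def best_epoch_with_early_stopping(val_losses, patience=50):
    """Return the index of the epoch with the best validation loss.

    Scanning stops once the loss has not improved for `patience`
    consecutive epochs. `val_losses` stands in for the per-epoch
    classifier validation loss; in a real run it would be produced
    by the training loop rather than passed in as a list.
    """
    best_loss = float("inf")
    best_epoch = -1
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            best_loss, best_epoch = loss, epoch
        elif epoch - best_epoch >= patience:
            break  # no improvement for `patience` epochs
    return best_epoch


# Grids quoted in the table above.
LAMBDAS = [0.01, 0.1, 1, 10]  # weight for baselines with a discriminator
GAMMAS = [0, 50]              # weight for baselines with a covariance term

# Assumption: a baseline using both terms would search the full product.
GRID = list(itertools.product(LAMBDAS, GAMMAS))

if __name__ == "__main__":
    # One split per seed; the table reports 20 seeds for Pokec, 40 for NBA.
    for seed in range(3):
        train, val, test = split_nodes(400, seed)
        print(seed, len(train), len(val), len(test))
```

Using a different seed for each call gives the "different train/validation/test splits" over which the reported results are averaged; how the authors actually seed their runs is not stated in the table.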