GraphENS: Neighbor-Aware Ego Network Synthesis for Class-Imbalanced Node Classification

Authors: Joonhyung Park, Jaeyun Song, Eunho Yang

ICLR 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We validate our method on various benchmark datasets including citation networks (Sen et al., 2008) and Amazon product co-purchasing networks (Shchur et al., 2018) with diverse architectures such as GCN (Kipf & Welling, 2017), GAT (Velickovic et al., 2018), and GraphSAGE (Hamilton et al., 2017), and confirm that our approach consistently outperforms baseline methods over various settings.
Researcher Affiliation | Collaboration | Joonhyung Park1, Jaeyun Song1, Eunho Yang1,2; Korea Advanced Institute of Science and Technology (KAIST)1, AITRICS2; {deepjoon, mercery, eunhoy}@kaist.ac.kr
Pseudocode | Yes | Our full algorithm is also provided in Appendix E (Algorithm 1).
Open Source Code | No | The paper does not contain any explicit statement about releasing source code, nor does it provide a link to a code repository for the described methodology.
Open Datasets | Yes | We validate GraphENS on five benchmark datasets: Cora, CiteSeer, PubMed for citation networks (Sen et al., 2008), Amazon Photo and Amazon Computers for co-purchase graphs (Shchur et al., 2018). (A dataset-loading sketch follows the table.)
Dataset Splits | Yes | To make validation/test sets balanced, we sample the same number of nodes from each class for the validation/test sets. Then, the remaining nodes are assigned to the training set. We fix the least number of train nodes across all classes at 20. In the semi-supervised setting, we set the imbalance ratio to 10. The detailed setup is provided in Appendix F.5. (A split-construction sketch follows the table.)
Hardware Specification | No | The paper mentions training GNNs and models, but does not provide any specific details about the hardware used (e.g., CPU, GPU models, or memory specifications).
Software Dependencies | No | The paper refers to various GNN architectures (GCN, GAT, GraphSAGE), optimizers (Adam), and activation functions (ReLU, ELU), along with concepts like dropout, but it does not specify version numbers for any software libraries or dependencies.
Experiment Setup | Yes | We adopt the Adam (Kingma & Ba, 2015) optimizer with an initial learning rate of 0.01 and train GNNs for 2000 epochs. The best models are selected by validation accuracy. We use a scheduler that halves the learning rate if there is no improvement in validation loss for 100 iterations. We apply weight decay of 0.0005 to all convolutional layers except for the linear classifier. For all datasets, we use the Beta(2, 2) distribution to sample λ. The feature-masking hyperparameter k and temperature τ are tuned over {1, 5, 10} and {1, 2}, respectively. (A training-configuration sketch follows the table.)
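
The five datasets quoted under Open Datasets are standard public benchmarks. The paper does not name a specific data-loading library, so the snippet below is a minimal sketch assuming PyTorch Geometric, whose `Planetoid` and `Amazon` dataset classes distribute exactly these graphs (the `root` paths are placeholders):

```python
from torch_geometric.datasets import Planetoid, Amazon

# Citation networks (Sen et al., 2008)
cora = Planetoid(root="data/Planetoid", name="Cora")
citeseer = Planetoid(root="data/Planetoid", name="CiteSeer")
pubmed = Planetoid(root="data/Planetoid", name="PubMed")

# Amazon co-purchase graphs (Shchur et al., 2018)
photo = Amazon(root="data/Amazon", name="Photo")
computers = Amazon(root="data/Amazon", name="Computers")
```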
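
The Dataset Splits excerpt describes balanced validation/test sets, assignment of the remaining nodes to training, a minimum of 20 training nodes per class, and an imbalance ratio of 10 in the semi-supervised setting. The sketch below illustrates one way to realize such a split; the per-class validation/test sizes, the step-imbalance scheme (half of the classes treated as majority), and the helper name `make_imbalanced_split` are illustrative assumptions rather than the paper's exact recipe (that is given in Appendix F.5 of the paper):

```python
import numpy as np

def make_imbalanced_split(labels, n_val_per_class=30, n_test_per_class=100,
                          n_train_minor=20, imb_ratio=10, seed=0):
    """Illustrative class-imbalanced node split.

    Balanced validation/test sets are sampled first (same number of nodes per
    class); the remaining nodes form a training pool from which a
    step-imbalanced training set is drawn: assumed 'major' classes get
    n_train_minor * imb_ratio nodes, 'minor' classes get n_train_minor (= 20).
    """
    rng = np.random.default_rng(seed)
    classes = np.unique(labels)
    n_major = len(classes) // 2  # assumption: step imbalance over half the classes
    train_idx, val_idx, test_idx = [], [], []

    for i, c in enumerate(classes):
        idx = rng.permutation(np.where(labels == c)[0])
        val_idx.append(idx[:n_val_per_class])
        test_idx.append(idx[n_val_per_class:n_val_per_class + n_test_per_class])
        pool = idx[n_val_per_class + n_test_per_class:]
        n_train = n_train_minor * imb_ratio if i < n_major else n_train_minor
        train_idx.append(pool[:n_train])

    return (np.concatenate(train_idx),
            np.concatenate(val_idx),
            np.concatenate(test_idx))
```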
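
The Experiment Setup excerpt fixes the optimizer, learning-rate schedule, weight-decay placement, and the Beta(2, 2) distribution for λ. The PyTorch sketch below wires those pieces together; the two-layer GCN backbone, hidden size, and Cora-sized input/output dimensions are illustrative assumptions, and GraphENS's ego-network synthesis itself is not reproduced here:

```python
import torch
from torch_geometric.nn import GCNConv

class GCN(torch.nn.Module):
    """Illustrative two-layer GCN backbone with a linear classifier head."""
    def __init__(self, in_dim, hid_dim, n_classes):
        super().__init__()
        self.conv1 = GCNConv(in_dim, hid_dim)
        self.conv2 = GCNConv(hid_dim, hid_dim)
        self.classifier = torch.nn.Linear(hid_dim, n_classes)

    def forward(self, x, edge_index):
        x = torch.relu(self.conv1(x, edge_index))
        x = torch.relu(self.conv2(x, edge_index))
        return self.classifier(x)

model = GCN(in_dim=1433, hid_dim=64, n_classes=7)  # Cora-sized dims, illustrative

# Adam with learning rate 0.01; weight decay 0.0005 on convolutional layers only,
# not on the linear classifier (as stated in the quoted setup).
optimizer = torch.optim.Adam(
    [
        {"params": list(model.conv1.parameters()) + list(model.conv2.parameters()),
         "weight_decay": 5e-4},
        {"params": model.classifier.parameters(), "weight_decay": 0.0},
    ],
    lr=0.01,
)

# Halve the learning rate when validation loss has not improved for 100 steps.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=100)

# Mixing coefficient lambda is drawn from Beta(2, 2).
beta = torch.distributions.Beta(2.0, 2.0)

for epoch in range(2000):
    lam = beta.sample()
    # ... GraphENS node synthesis with lam, forward pass, loss, optimizer.step() ...
    # scheduler.step(val_loss)  # after evaluating on the balanced validation set
    # keep the checkpoint with the best validation accuracy
    pass
```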