Boost then Convolve: Gradient Boosting Meets Graph Neural Networks

Authors: Sergei Ivanov, Liudmila Prokhorenkova

ICLR 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | With an extensive experimental comparison to the leading GBDT and GNN models, we demonstrate a significant increase in performance on a variety of graphs with tabular features. The code is available: https://github.com/nd7141/bgnn. We have performed a comparative evaluation of BGNN and Res-GNN against a wide variety of strong baselines and previous approaches on heterogeneous node prediction problems, achieving significant improvement in performance across all of them. This section outlines our experimental setting, the results on node regression and classification problems, and extracted feature representations. The results of our comparative evaluation for node regression are summarized in Table 2.
Researcher Affiliation | Collaboration | Sergei Ivanov (Criteo AI Lab; Skoltech), Paris, France, s.ivanov@criteo.com; Liudmila Prokhorenkova (Yandex; HSE University; MIPT), Moscow, Russia, ostroumova-la@yandex-team.ru
Pseudocode | Yes | Algorithm 1: Training of BGNN. (A hedged sketch of this alternating training loop is given after the table.)
Open Source Code | Yes | The code is available: https://github.com/nd7141/bgnn.
Open Datasets | Yes | We utilize five real-world node regression datasets with different properties outlined in Table 1. Four of these datasets are heterogeneous, i.e., the input features are of different types, scales, and meaning. For example, for the VK dataset, the node features are both numerical (e.g., last time seen on the platform) and categorical (e.g., country of living and university). On the other hand, the Wiki dataset is homogeneous, i.e., the node features are interdependent and correspond to the bag-of-words representations of Wikipedia articles. Additional details about the datasets can be found in Appendix C. For node classification, we use five datasets with different properties. Due to the lack of publicly available datasets with heterogeneous node features, we adopt the datasets House_class and VK_class from the regression task by converting the target labels into several discrete classes. We additionally include two sparse node classification datasets, SLAP and DBLP, coming from heterogeneous information networks (HIN) with nodes having different types. We also include one homogeneous dataset, OGB-ArXiv (Hu et al., 2020a). (An illustrative target-binning sketch follows the table.)
Dataset Splits | Yes | We ensure that the comparison is done fairly by training each model until convergence with a reasonable set of hyperparameters evaluated on the validation set. We run each hyperparameter setting three times and take the average of the results. Furthermore, we have five random splits of the data, and the final number represents the average performance of the model over all five random seeds. More details about hyperparameters can be found in Appendix B. We use five random splits for train/validation/test with a 0.6/0.2/0.2 ratio. (A minimal split-generation sketch follows the table.)
Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory specifications) used for running the experiments. It discusses training time but not the underlying hardware.
Software Dependencies | No | The paper mentions software such as CatBoost, LightGBM, and PyTorch Geometric (used for the GNNs) but does not provide specific version numbers for these dependencies, which is required for reproducibility. (A small version-recording sketch follows the table.)
Experiment Setup | Yes | We ensure that the comparison is done fairly by training each model until convergence with a reasonable set of hyperparameters evaluated on the validation set. More details about hyperparameters can be found in Appendix B.
- LightGBM: number of leaves is {15, 63}, λ_L2 = 0, boosting type is gbdt, number of epochs is 1000, early stopping rounds is 100.
- CatBoost: depth is {4, 6}, λ_L2 = 0, number of epochs is 1000, early stopping rounds is 100.
- FCNN: number of layers is {2, 3}, dropout is {0., 0.5}, hidden dimension is 64, number of epochs is 5000, early stopping rounds is 2000.
- GNN: dropout rate is {0., 0.5}, hidden dimension is 64, number of epochs is 2000, early stopping rounds is 200. GAT, GCN, and AGNN models have two convolutional layers with dropout and ELU activation function (Clevert et al., 2016). APPNP has a two-layer fully-connected neural network with dropout and ELU activation followed by a convolutional layer with k = 10 and α = 0.1. We use eight heads with eight hidden neurons for the GAT model.
- Res-GNN: dropout rate is {0., 0.5}, hidden dimension is 64, number of epochs is 1000, early stopping rounds is 100. We also tune whether to use solely the predictions of the CatBoost model or append them to the input features. The CatBoost model is trained for 1000 epochs.
- BGNN: dropout rate is {0., 0.5}, hidden dimension is 64, number of epochs is 200, early stopping rounds is 10, number of trees and backward passes per epoch is {10, 20}, depth of the tree is 6. We also tune whether to use solely the predictions of the CatBoost model or append them to the input features.
- For all models, we also perform a hyperparameter search on the learning rate in {0.1, 0.01}.
(An illustrative sketch of enumerating these grids follows the table.)
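
To make the Pseudocode entry concrete, below is a minimal sketch of the alternating scheme that Algorithm 1 describes: a GBDT round fits the current targets, its accumulated predictions are appended to the node features, a GNN is trained on the graph, and the gradient of the GNN loss with respect to the appended column defines the next GBDT targets. The library choices (scikit-learn's GradientBoostingRegressor instead of CatBoost, a dense normalized adjacency, a single "propagate after MLP" step in place of a full GNN), the restriction of the feature-update step to the appended prediction column, and all hyperparameters are simplifying assumptions for illustration, not the authors' implementation.

```python
# Minimal, hypothetical sketch of BGNN-style training (Algorithm 1, simplified).
# X: (n, d) numpy feature matrix, A_hat: (n, n) normalized adjacency,
# y: (n,) numpy regression targets, train_mask: (n,) boolean numpy array.
import numpy as np
import torch
import torch.nn as nn
from sklearn.ensemble import GradientBoostingRegressor


def train_bgnn_sketch(X, A_hat, y, train_mask, epochs=10, gnn_steps=20, lr=0.01):
    n, d = X.shape
    mask = torch.as_tensor(train_mask)
    A_hat_t = torch.as_tensor(A_hat, dtype=torch.float32)
    y_t = torch.as_tensor(y, dtype=torch.float32).view(-1, 1)

    # Stand-in GNN: transform features with an MLP, then propagate once over A_hat.
    gnn = nn.Sequential(nn.Linear(d + 1, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(gnn.parameters(), lr=lr)

    f = np.zeros((n, 1))                    # cumulative GBDT prediction
    target = y[train_mask].astype(float)    # first GBDT round fits the raw targets
    gbdt_rounds = []

    for _ in range(epochs):
        # 1) Boost: fit a small GBDT on the current targets, accumulate its output.
        gbdt = GradientBoostingRegressor(n_estimators=10, max_depth=6)
        gbdt.fit(X[train_mask], np.ravel(target))
        gbdt_rounds.append(gbdt)
        f = f + gbdt.predict(X).reshape(-1, 1)

        # 2) Convolve: append the GBDT output to the node features and train the
        #    GNN; features are a leaf tensor so their gradient is available.
        feats = torch.tensor(np.hstack([X, f]), dtype=torch.float32, requires_grad=True)
        for _ in range(gnn_steps):
            opt.zero_grad()
            if feats.grad is not None:
                feats.grad = None            # zero_grad only clears parameter grads
            out = A_hat_t @ gnn(feats)       # one propagation step
            loss = ((out[mask] - y_t[mask]) ** 2).mean()
            loss.backward()
            opt.step()

        # 3) New GBDT targets: the next boosting round approximates a gradient
        #    step on the appended prediction column (a simplification; the paper
        #    updates node features through CatBoost).
        target = -lr * feats.grad[:, -1:].numpy()[train_mask]

    return gbdt_rounds, gnn
```

At inference time, under these assumptions, the accumulated GBDT output would again be appended to the node features and passed through the trained GNN to obtain predictions.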
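The Open Datasets entry notes that House_class and VK_class were derived by discretizing regression targets into several classes. A hypothetical way to perform such binning is sketched below; the quantile rule, the number of classes, and the synthetic stand-in targets are assumptions, not the authors' exact procedure.

```python
# Hypothetical quantile binning of continuous targets into discrete classes.
import numpy as np


def to_classes(y, n_classes=5):
    # Quantile edges give roughly balanced classes even for skewed targets.
    edges = np.quantile(y, np.linspace(0, 1, n_classes + 1)[1:-1])
    return np.digitize(y, edges)


y = np.random.lognormal(mean=0.0, sigma=1.0, size=1000)  # stand-in for house prices
labels = to_classes(y, n_classes=5)
print(np.bincount(labels))  # roughly equal class counts
```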
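For the Dataset Splits protocol (five random 0.6/0.2/0.2 train/validation/test splits), a minimal reproduction sketch could look as follows; the seed values and the helper name make_splits are illustrative, not taken from the paper.

```python
# Five seeded 0.6/0.2/0.2 train/validation/test splits over node indices.
import numpy as np


def make_splits(n_nodes, seeds=(0, 1, 2, 3, 4), ratios=(0.6, 0.2, 0.2)):
    splits = []
    for seed in seeds:
        rng = np.random.default_rng(seed)
        perm = rng.permutation(n_nodes)
        n_train = int(ratios[0] * n_nodes)
        n_val = int(ratios[1] * n_nodes)
        splits.append({
            "train": perm[:n_train],
            "val": perm[n_train:n_train + n_val],
            "test": perm[n_train + n_val:],
        })
    return splits


splits = make_splits(10000)
print(len(splits), [len(s["train"]) for s in splits])  # 5 splits, 6000 training nodes each
```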
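Since the Software Dependencies entry flags missing version numbers, one way a re-implementation could record the versions actually used is sketched below; the package list reflects the libraries named in the review plus PyTorch itself, and torch_geometric is the assumed import name for PyTorch Geometric.

```python
# Print the installed versions of the libraries the experiments depend on.
import importlib

for pkg in ["torch", "torch_geometric", "catboost", "lightgbm"]:
    try:
        mod = importlib.import_module(pkg)
        print(pkg, getattr(mod, "__version__", "unknown"))
    except ImportError:
        print(pkg, "not installed")
```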
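Finally, the hyperparameter grids quoted in the Experiment Setup entry could be encoded and enumerated as below; the dictionary keys and the iter_configs helper are illustrative rather than the authors' tuning code, and only a subset of the grids is shown.

```python
# Enumerate hyperparameter settings from per-model grids (subset, illustrative).
from itertools import product

grids = {
    "LightGBM": {"num_leaves": [15, 63], "learning_rate": [0.1, 0.01]},
    "CatBoost": {"depth": [4, 6], "learning_rate": [0.1, 0.01]},
    "GNN": {"dropout": [0.0, 0.5], "hidden_dim": [64], "learning_rate": [0.1, 0.01]},
    "BGNN": {"dropout": [0.0, 0.5], "trees_per_epoch": [10, 20], "learning_rate": [0.1, 0.01]},
}


def iter_configs(grid):
    # Cartesian product over the per-parameter value lists.
    keys = sorted(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


for config in iter_configs(grids["BGNN"]):
    print(config)  # each setting is run three times and the results averaged
```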