MixHop: Higher-Order Graph Convolutional Architectures via Sparsified Neighborhood Mixing

Authors: Sami Abu-El-Haija, Bryan Perozzi, Amol Kapoor, Nazanin Alipourfard, Kristina Lerman, Hrayr Harutyunyan, Greg Ver Steeg, Aram Galstyan

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | 5. Experimental Design: Given the model described above, a number of natural questions arise. In this section, we aim to design experiments which answer the following hypotheses: H1: The MixHop model learns delta operators. H2: Higher-order graph convolutions using neighborhood mixing can outperform existing approaches (e.g. vanilla GCNs) on real semi-supervised learning tasks. H3: When learning a model architecture for MixHop, the best-performing architectures differ for each graph. To answer these questions, we design three experiments. Synthetic Experiments: This experiment uses a family of synthetic graphs which allow us to vary the correlation (or homophily) of the edges in a generated graph, and observe how different graph convolutional approaches respond. Real-World Experiments: This experiment evaluates MixHop's performance on a variety of noisy real-world datasets, comparing against challenging baselines. Model Visualization Experiment: This experiment shows how an appropriately regularized MixHop model can learn different, task-dependent architectures.
Researcher Affiliation | Collaboration | Sami Abu-El-Haija (1), Bryan Perozzi (2), Amol Kapoor (2), Nazanin Alipourfard (1), Kristina Lerman (1), Hrayr Harutyunyan (1), Greg Ver Steeg (1), Aram Galstyan (1). (1) Information Sciences Institute, University of Southern California; (2) Google AI, New York.
Pseudocode | Yes | Algorithm 1: MixHop Graph Convolution Layer.
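To make the quoted Algorithm 1 concrete, here is a minimal NumPy/SciPy sketch of one MixHop graph convolution layer, which concatenates sigma(Â^j H W_j) over the adjacency powers j in P, with Â the symmetrically normalized adjacency with self-loops. The helper names (`normalize_adjacency`, `mixhop_layer`), the tanh nonlinearity, and the toy graph are illustrative choices, not taken from the authors' released code.

```python
import numpy as np
import scipy.sparse as sp


def normalize_adjacency(adj):
    """Symmetric normalization with self-loops: D^{-1/2} (A + I) D^{-1/2}."""
    adj = adj + sp.eye(adj.shape[0])
    deg = np.asarray(adj.sum(axis=1)).ravel()
    d_inv_sqrt = sp.diags(1.0 / np.sqrt(deg))
    return d_inv_sqrt @ adj @ d_inv_sqrt


def mixhop_layer(a_hat, h, weights, activation=np.tanh):
    """One MixHop layer: column-wise concatenation of
    activation(a_hat^j @ h @ W_j) over the adjacency powers j in `weights`.

    `weights` maps each power j (e.g. {0, 1, 2}) to its weight matrix W_j;
    power 0 reduces to a plain fully connected transform of the features.
    """
    outputs = []
    propagated, current_power = h, 0          # a_hat^0 @ h == h
    for j in sorted(weights):
        while current_power < j:              # raise to the needed power incrementally
            propagated = a_hat @ propagated
            current_power += 1
        outputs.append(activation(propagated @ weights[j]))
    return np.concatenate(outputs, axis=1)


# Tiny usage example on a random graph with powers P = {0, 1, 2}.
rng = np.random.default_rng(0)
n, d = 10, 8
adj = sp.csr_matrix((rng.random((n, n)) < 0.3).astype(float))
adj = adj.maximum(adj.T)                      # symmetrize the toy graph
a_hat = normalize_adjacency(adj)
features = rng.normal(size=(n, d))
weights = {j: rng.normal(size=(d, 20)) for j in (0, 1, 2)}  # 60 dims split evenly
out = mixhop_layer(a_hat, features, weights)
print(out.shape)                              # (10, 60)
```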
Open Source Code | Yes | Our code is available on github.com/samihaija/mixhop.
Open Datasets | Yes | We conduct semi-supervised node classification experiments on synthetic and real-world datasets. Synthetic Datasets: Our synthetic datasets are generated following Karimi et al. (2017). ... Real-World Datasets: The experiments with real-world datasets follow the methodology proposed in Yang et al. (2016).
Dataset Splits | Yes | We randomly partition each graph into train, test, and validation node splits, all of equal size.
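As a concrete reading of that splitting protocol, a small sketch is shown below; the function name and seed handling are illustrative, and the authors' own splitting script may differ in detail.

```python
import numpy as np


def equal_node_splits(num_nodes, seed=0):
    """Randomly partition node indices into train / validation / test
    splits of (near-)equal size, as described in the quote above."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(num_nodes)
    third = num_nodes // 3
    return perm[:third], perm[third:2 * third], perm[2 * third:]


train_idx, valid_idx, test_idx = equal_node_splits(2708)  # e.g. a Cora-sized graph
```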
Hardware Specification | No | The paper mentions training models using TensorFlow but does not provide any specific hardware details such as GPU or CPU models, processor types, or memory specifications.
Software Dependencies | No | The paper states 'We construct a 2-layer network of our model using TensorFlow (Abadi et al., 2016)' but does not provide a specific version number for TensorFlow or any other software dependencies.
Experiment Setup | Yes | For all experiments, we construct a 2-layer network of our model using TensorFlow (Abadi et al., 2016). We train our models using a Gradient Descent optimizer for a maximum of 2000 steps, with an initial learning rate of 0.05 that decays by 0.0005 every 40 steps. We terminate training if validation accuracy does not improve for 40 consecutive steps; as a result, most runs finish in less than 200 steps. We use 5 × 10^-4 L2 regularization on the weights, and apply dropout to the input and hidden layers. We note that the citation datasets are extremely sensitive to initializations; as such, we run all models 100 times, sort by the validation accuracy, and finally report the test accuracy for the top 50 runs. For all models we ran (our models in Tables 1 & 3, and all models in Table 3), we use a latent dimension of 60; our default architecture divides the 60 dimensions evenly across all |P| powers.
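The quoted setup pins down a step-wise learning-rate schedule, early stopping on validation accuracy, and a report-the-top-50-of-100-runs protocol. The sketch below illustrates those pieces in plain Python; `train_once` is a hypothetical stand-in for an actual training run, and the schedule assumes "decays by 0.0005" means a subtractive decrement (the excerpt does not say whether the decay is subtractive or multiplicative).

```python
import numpy as np


def learning_rate(step, initial=0.05, decrement=0.0005, every=40):
    """Step-wise schedule from the quoted setup (assumed subtractive decay)."""
    return max(initial - decrement * (step // every), 0.0)


def train_once(seed, max_steps=2000, patience=40):
    """Hypothetical single training run returning (val_acc, test_acc).

    A real implementation would build the 2-layer MixHop model, apply
    5e-4 L2 regularization and dropout, and stop early once validation
    accuracy has not improved for `patience` consecutive steps.
    """
    rng = np.random.default_rng(seed)
    best_val, since_improvement = 0.0, 0
    for step in range(max_steps):
        _ = learning_rate(step)               # would be fed to the optimizer
        val_acc = rng.uniform(0.70, 0.85)     # stand-in for a real validation metric
        if val_acc > best_val:
            best_val, since_improvement = val_acc, 0
        else:
            since_improvement += 1
        if since_improvement >= patience:     # early stopping
            break
    test_acc = rng.uniform(0.70, 0.85)        # stand-in number
    return best_val, test_acc


def report_top_half(num_runs=100, keep=50):
    """Run 100 times, sort by validation accuracy, and report the mean test
    accuracy of the best 50 runs, following the quoted evaluation protocol."""
    runs = sorted((train_once(seed) for seed in range(num_runs)),
                  key=lambda r: r[0], reverse=True)
    return float(np.mean([test for _, test in runs[:keep]]))


print(round(report_top_half(), 4))
```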