Scalable Deep Gaussian Markov Random Fields for General Graphs
Authors: Joel Oskarsson, Per Sidén, Fredrik Lindsten
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The usefulness of the proposed model is verified by experiments on a number of synthetic and real-world datasets, where it compares favorably to other Bayesian and deep learning methods. |
| Researcher Affiliation | Collaboration | 1) Division of Statistics and Machine Learning, Department of Computer and Information Science, Linköping University, Linköping, Sweden; 2) Arriver Software AB. |
| Pseudocode | No | No pseudocode or clearly labeled algorithm block was found in the paper. |
| Open Source Code | Yes | Our code is available at https://github.com/joeloskarsson/graph-dgmrf. |
| Open Datasets | Yes | Wikipedia graphs were created and made available by Rozemberczki et al. (2021); the classical California housing dataset (Kelley Pace & Barry, 1997) contains median house values of 20 640 housing blocks located in California, and based on their spatial coordinates a sparse graph is created by Delaunay triangulation (De Loera et al., 2010); the wind speed data originates from the Wind Integration National Dataset Toolkit. (A graph-construction sketch follows the table.) |
| Dataset Splits | Yes | We generally treat 50% of nodes as unobserved, chosen uniformly at random; for the MLP baseline we consider the layer configurations (number of hidden layers × hidden dimensionality) {1×128, 1×512, 2×128, 2×512}... The layer configuration resulting in the lowest validation loss is then used for the ensemble; the GNN models are trained in the same way as the MLP, also using 20% of the observed nodes for validation. (A split sketch follows the table.) |
| Hardware Specification | No | The paper mentions 'Using a consumer-grade GPU' but does not specify the exact model or other detailed hardware specifications for the experiments. |
| Software Dependencies | No | The paper mentions software like 'PyTorch', 'PyTorch Geometric', 'GPyTorch', and 'scikit-learn' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For training our DGMRF we use a learning rate of 0.01 and the Adam optimizer (Kingma & Ba, 2015) in all experiments. The model has not been observed to be sensitive to these choices, so no extensive tuning has been done. Note that overfitting is not a considerable problem here. If necessary the learning rate can be tuned to make the ELBO converge, using only the training data (observed nodes). On synthetic data we train the model for 50 000 iterations, on the Wikipedia and California housing datasets 80 000 iterations (150 000 for 5-layer DGMRFs) and for the wind speed data 150 000 iterations. These numbers are large enough for the ELBO to converge and often unnecessarily high, meaning that runtimes could be slightly reduced with a more careful choice. In all experiments we use one DGMRF layer for G in the variational distribution q (see Eq. 11). At each iteration of training we draw 10 samples from q to estimate the expectation in the ELBO. (A training-loop sketch follows the table.) |
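
To make the Open Datasets row concrete, below is a minimal sketch of building a sparse graph over the California housing blocks by Delaunay triangulation of their spatial coordinates. It uses scikit-learn's `fetch_california_housing` and `scipy.spatial.Delaunay`; the column choices and edge construction are assumptions and may differ from the preprocessing in the authors' repository.

```python
# Sketch: California housing graph via Delaunay triangulation (not the authors' pipeline).
import numpy as np
from scipy.spatial import Delaunay
from sklearn.datasets import fetch_california_housing

housing = fetch_california_housing(as_frame=True)
coords = housing.frame[["Longitude", "Latitude"]].to_numpy()  # (20640, 2) block coordinates
y = housing.frame["MedHouseVal"].to_numpy()                   # median house value per block

tri = Delaunay(coords)
edges = set()
for simplex in tri.simplices:                 # each simplex is a triangle (i, j, k)
    for a in range(3):
        for b in range(a + 1, 3):
            i, j = sorted((int(simplex[a]), int(simplex[b])))
            edges.add((i, j))                 # add each triangle edge once, undirected

edge_index = np.array(sorted(edges)).T        # (2, num_edges) edge list for the sparse graph
print(edge_index.shape, y.shape)
```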
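The Dataset Splits row describes the masking scheme; the sketch below shows one plausible way to realize it, holding out 50% of nodes as unobserved uniformly at random and reserving 20% of the observed nodes for validation of the MLP/GNN baselines. The seed and node count are illustrative assumptions.

```python
# Sketch: node-level split with 50% unobserved and a 20% validation subset of the observed nodes.
import numpy as np

rng = np.random.default_rng(0)                # arbitrary seed for illustration
n_nodes = 20_640                              # e.g. the California housing graph
perm = rng.permutation(n_nodes)

unobserved = perm[: n_nodes // 2]             # 50% of nodes unobserved (prediction targets)
observed = perm[n_nodes // 2 :]

n_val = int(0.2 * observed.size)
val_nodes = observed[:n_val]                  # 20% of observed nodes for validation
train_nodes = observed[n_val:]

obs_mask = np.zeros(n_nodes, dtype=bool)      # boolean mask over all nodes
obs_mask[observed] = True
```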
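Finally, a hypothetical training-loop skeleton matching the Experiment Setup row: Adam with learning rate 0.01 and 10 samples from the variational distribution q per iteration to estimate the expectation in the ELBO. The `model.elbo(...)`, `q.rsample(...)`, and `q.entropy()` calls are placeholder interfaces, not the authors' actual API; the real implementation is in the linked repository.

```python
# Sketch: variational training loop with the reported optimizer settings (placeholder model API).
import torch

def train_dgmrf(model, q, y_obs, obs_mask, n_iterations=50_000, n_samples=10, lr=0.01):
    params = list(model.parameters()) + list(q.parameters())
    optimizer = torch.optim.Adam(params, lr=lr)
    for _ in range(n_iterations):
        optimizer.zero_grad()
        x = q.rsample((n_samples,))                        # 10 reparameterized samples from q
        # Monte Carlo ELBO estimate: mean over samples plus the entropy of q
        elbo = model.elbo(x, y_obs, obs_mask).mean() + q.entropy()
        (-elbo).backward()                                 # maximize ELBO = minimize -ELBO
        optimizer.step()
    return model, q
```

The iteration count would be set per dataset (50 000, 80 000, or 150 000) as quoted in the table.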