Low-dimensional statistical manifold embedding of directed graphs
Authors: Thorben Funke, Tian Guo, Alen Lancic, Nino Antulov-Fantulin
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments show that our proposed embedding is better in preserving the global geodesic information of graphs, as well as outperforming existing embedding models on directed graphs in a variety of evaluation metrics, in an unsupervised setting. |
| Researcher Affiliation | Academia | Thorben Funke, L3S Research Center, Leibniz University Hannover, Hannover, Germany; Tian Guo, Computational Social Science, ETH Zürich, Zurich, Switzerland; Alen Lancic, Faculty of Science, Department of Mathematics, University of Zagreb, Croatia; Nino Antulov-Fantulin, Computational Social Science, ETH Zürich, Zurich, Switzerland, anino@ethz.ch |
| Pseudocode | Yes | To clarify our two algorithms, we include the pseudo-code of the full algorithm (Algorithm 1) and the scalable variant (Algorithm 2). |
| Open Source Code | No | The paper does not provide a specific link to source code or explicitly state that the code for the described methodology is publicly available. |
| Open Datasets | Yes | From the Koblenz Network Collection [30] we retrieved three datasets of different sizes and connectivity. See Table 1 for an overview. Political blogs. The small dataset was compiled during the 2004 US election [1]. Cora. [50] consists of citations between computer science publications and was used as an example in the baselines APP [59] and HOPE [39] as well. Publication network. With a higher density but lower reciprocity, our largest example is the publication network given by arXiv's High Energy Physics Theory (hep-th) section [31]. |
| Dataset Splits | No | The paper does not explicitly provide specific train/validation/test dataset splits (e.g., percentages or counts) or describe a formal cross-validation setup for its experiments. |
| Hardware Specification | Yes | HOPE was executed with GNU Octave version 4.4.1, and the other methods were executed in Python 3.6.7 and Tensorflow on a server with 258 GB RAM and a NVIDIA Titan Xp GPU. |
| Software Dependencies | Yes | HOPE was executed with GNU Octave version 4.4.1, and the other methods were executed in Python 3.6.7 and Tensorflow on a server with 258 GB RAM and a NVIDIA Titan Xp GPU. |
| Experiment Setup | Yes | With the nonlinear interactions in our embedding, our method needs only a small number of dimensions, and for the presented results we used an embedding into a 2-variate normal distribution, i.e. k = 2, so the number of free parameters for each node is 4. For the random-walk-based methods APP and DeepWalk, we unified both default settings with 20 random walks of length 100 for each node. The embedding dimension was set to K = 4, where we allowed APP to use two 4-dimensional vectors. In the same fashion, we set the embedding dimension for HOPE to 4, resulting in two |V| × 4 matrices. For our proposed embedding, we evaluate both the exact and approximate versions. For the approximate one, we report the results based on B = 10 and B = 100 samples for each node. We executed the optimization with β ∈ {1/2, 1} and consistently retrieved the best results for β = 1/2. The initial means µ_u are drawn uniformly from [0, 10] and the initial co-variances σ_u^i are drawn uniformly from [4, 7]. As the initial value of τ we selected 2.5. In the experiments, we applied the Adam optimizer with learning rates in {.001, .01, .1} and retrieved the reported results with .1 for political blogs and .001 for the others. For our full method, we used no batching for the political blogs network, 410 batches for Cora, and 2777 batches for arXiv hep-th, with and without shuffling between epochs; we saw increased performance with shuffling, especially for the larger datasets. The approximated approach uses no batching for the variant with B = 10 and 10 batches for B = 100. |
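The quoted setup embeds each node as a k-variate normal distribution (k = 2, giving 4 free parameters per node: a mean and a diagonal co-variance), with means initialized uniformly in [0, 10] and co-variances in [4, 7]. The sketch below illustrates only that initialization and why a distribution-valued embedding suits directed graphs: a divergence between two Gaussians is asymmetric. The KL divergence used here is a generic illustration, not necessarily the paper's exact dissimilarity measure.

```python
import numpy as np

rng = np.random.default_rng(0)
k = 2        # dimension of each node's normal distribution (k = 2 in the paper)
n_nodes = 5  # toy graph size for illustration

# Initialization as described in the experiment setup:
# means uniform in [0, 10], diagonal co-variances uniform in [4, 7].
mu = rng.uniform(0.0, 10.0, size=(n_nodes, k))
var = rng.uniform(4.0, 7.0, size=(n_nodes, k))  # diagonal entries of Sigma_u

def kl_diag_gauss(mu_p, var_p, mu_q, var_q):
    """KL(N_p || N_q) for diagonal-covariance Gaussians.

    KL is asymmetric in its arguments, which is the property that lets a
    statistical-manifold embedding represent directed (asymmetric) distances.
    """
    return 0.5 * np.sum(
        np.log(var_q / var_p) + (var_p + (mu_p - mu_q) ** 2) / var_q - 1.0
    )

d_uv = kl_diag_gauss(mu[0], var[0], mu[1], var[1])  # "distance" u -> v
d_vu = kl_diag_gauss(mu[1], var[1], mu[0], var[0])  # "distance" v -> u
# d_uv and d_vu differ in general, unlike a symmetric Euclidean distance.
```

In an actual training loop, `mu` and `var` would be the trainable parameters updated by Adam (with the learning rates quoted above) to match such divergences to graph distances.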