Anonymous Walk Embeddings
Authors: Sergey Ivanov, Evgeny Burnaev
ICML 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experimental study shows state-of-the-art classification accuracy of feature-based AWE on real datasets. |
| Researcher Affiliation | Collaboration | ¹Skolkovo Institute of Science and Technology, Moscow, Russia; ²Criteo Research, Paris, France. |
| Pseudocode | Yes | Figure 3. A framework for learning data-driven anonymous walk embeddings. |
| Open Source Code | Yes | Code can be found at https://github.com/nd7141/AWE |
| Open Datasets | Yes | We evaluate performance on two sets of graphs. One set contains unlabeled graph data and is related to social networks (Yanardag & Vishwanathan, 2015). Another set contains graphs with labels on node and/or edges and originates from bioinformatics (Shervashidze et al., 2011). Statistics of these ten graph datasets presented in Table 1. |
| Dataset Splits | Yes | We perform a 10-fold cross-validation and for each fold we estimate SVM parameter C from the range [0.001, 0.01, 0.1, 1, 10] using validation set. |
| Hardware Specification | Yes | We run the experiments on a machine with Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz and 32GB RAM. |
| Software Dependencies | No | The paper mentions using an 'SVM classifier' and 'sampled softmax' but does not specify versions for any key software libraries or dependencies. |
| Experiment Setup | Yes | For feature-based anonymous walk embeddings (Def. 1), we choose length l of walks from the range [2, 3, ..., 10] and approximate actual distribution of anonymous walks using sampling equation (2) with ε = 0.1 and δ = 0.05. For data-driven anonymous walk embeddings (Def. 4), we set length of walks l = 10 to generate a corpus of cooccurred anonymous walks. We run gradient descent with 100 iterations for 100 epochs with batch size that we vary from the range [100, 500, 1000, 5000, 10000]. Context walks are drawn from a window, which size varies in the range [2, 4, 8, 16]. The embedding size of walks and graphs d_a and d_g equals to 128. Finally, candidate sampling function for softmax equation (4) chooses uniform or loguniform distribution of sampled classes. For RBF kernel function we choose parameter σ from the range [10⁻⁵, 10⁻⁴, ..., 1, 10]; for Polynomial function we set c = 0 and d = 2. |
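
The Experiment Setup row refers to "sampling equation (2)" with ε = 0.1 and δ = 0.05 but does not quote the equation. The sketch below assumes it has the form m = ⌈(2/ε²)(ln(2^η − 2) − ln δ)⌉, where η is the number of distinct anonymous walks of length l. It also assumes that walks have no self-loops. Both assumptions should be checked against the paper. The function names are illustrative and are not taken from the authors' repository.

```python
import math


def count_anonymous_walks(length: int) -> int:
    """Count anonymous walks with `length` edges.

    Assumption: no self-loops, so consecutive labels must differ.
    """
    def extend(last: int, highest: int, remaining: int) -> int:
        if remaining == 0:
            return 1
        total = 0
        # The next label is any label seen so far except the current one,
        # or the first unused label (highest + 1).
        for label in range(1, highest + 2):
            if label != last:
                total += extend(label, max(highest, label), remaining - 1)
        return total

    return extend(1, 1, length)


def required_samples(eta: int, eps: float = 0.1, delta: float = 0.05) -> int:
    """Assumed form of equation (2): ceil(2 / eps^2 * (ln(2^eta - 2) - ln(delta)))."""
    if eta < 2:
        raise ValueError("eta must be at least 2 so that 2**eta - 2 is positive")
    # math.log accepts arbitrarily large Python integers, so 2**eta is safe here.
    return math.ceil(2.0 / eps**2 * (math.log(2**eta - 2) - math.log(delta)))
```

Under these assumptions, `required_samples(count_anonymous_walks(l))` gives the number of random walks to sample per graph for each l in the quoted range [2, 3, ..., 10].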
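
The hyperparameter ranges quoted in the Dataset Splits and Experiment Setup rows can be collected into a single search grid. The sketch below is one possible encoding. The variable and key names are hypothetical, and the σ range is assumed to step by powers of ten between 10⁻⁵ and 10.

```python
from itertools import product

# Values quoted in the Experiment Setup row; the names are illustrative only.
WALK_LENGTH = 10
EMBEDDING_SIZE = 128  # shared by walk (d_a) and graph (d_g) embeddings
BATCH_SIZES = [100, 500, 1000, 5000, 10000]
WINDOW_SIZES = [2, 4, 8, 16]
CANDIDATE_SAMPLERS = ["uniform", "loguniform"]

# Classifier-side ranges from the same table.
SVM_C_VALUES = [0.001, 0.01, 0.1, 1, 10]
# Assumption: the elided range steps by powers of ten from 1e-5 up to 10.
RBF_SIGMAS = [10.0**k for k in range(-5, 2)]


def data_driven_configs():
    """Yield one configuration dict per combination of the varied settings."""
    for batch_size, window, sampler in product(
        BATCH_SIZES, WINDOW_SIZES, CANDIDATE_SAMPLERS
    ):
        yield {
            "walk_length": WALK_LENGTH,
            "embedding_size": EMBEDDING_SIZE,
            "batch_size": batch_size,
            "window_size": window,
            "candidate_sampler": sampler,
        }
```

Iterating over `data_driven_configs()` enumerates 5 × 4 × 2 = 40 settings for the data-driven model. The table does not say whether the authors searched this grid exhaustively.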
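
The last sentence of the Experiment Setup row names an RBF kernel with parameter σ and a polynomial kernel with c = 0 and d = 2, without giving their formulas. The sketch below assumes the common definitions exp(−‖x − y‖² / (2σ²)) and (x·y + c)^d. The paper's exact parametrization may differ, so this is an illustration and not the authors' implementation.

```python
import math
from typing import Sequence


def rbf_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Assumed RBF form: exp(-||x - y||^2 / (2 * sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-squared_distance / (2.0 * sigma**2))


def polynomial_kernel(
    x: Sequence[float], y: Sequence[float], c: float = 0.0, d: int = 2
) -> float:
    """Assumed polynomial form: (x . y + c)^d, with the quoted defaults c = 0, d = 2."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** d
```

Here `x` and `y` stand for the embedding vectors of two graphs. A kernel SVM would evaluate one of these functions on every pair of graphs to build its Gram matrix.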