reproducibilityindex.ai

Anonymous Walk Embeddings

Authors: Sergey Ivanov, Evgeny Burnaev

ICML 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Our experimental study shows state-of-the-art classification accuracy of feature-based AWE on real datasets.
Researcher Affiliation	Collaboration	¹Skolkovo Institute of Science and Technology, Moscow, Russia; ²Criteo Research, Paris, France.
Pseudocode	Yes	Figure 3. A framework for learning data-driven anonymous walk embeddings.
Open Source Code	Yes	Code can be found at https://github.com/nd7141/AWE
Open Datasets	Yes	We evaluate performance on two sets of graphs. One set contains unlabeled graph data and is related to social networks (Yanardag & Vishwanathan, 2015). Another set contains graphs with labels on nodes and/or edges and originates from bioinformatics (Shervashidze et al., 2011). Statistics of these ten graph datasets are presented in Table 1.
Dataset Splits	Yes	We perform a 10-fold cross-validation and for each fold we estimate the SVM parameter C from the range [0.001, 0.01, 0.1, 1, 10] using a validation set.
Hardware Specification	Yes	We run the experiments on a machine with Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz and 32GB RAM.
Software Dependencies	No	The paper mentions using an 'SVM classifier' and 'sampled softmax' but does not specify versions for any key software libraries or dependencies.
Experiment Setup	Yes	For feature-based anonymous walk embeddings (Def. 1), we choose length l of walks from the range [2, 3, ..., 10] and approximate the actual distribution of anonymous walks using sampling equation (2) with ε = 0.1 and δ = 0.05. For data-driven anonymous walk embeddings (Def. 4), we set length of walks l = 10 to generate a corpus of co-occurring anonymous walks. We run gradient descent with 100 iterations for 100 epochs with batch size that we vary from the range [100, 500, 1000, 5000, 10000]. Context walks are drawn from a window whose size varies in the range [2, 4, 8, 16]. The embedding size of walks and graphs, d_a and d_g, equals 128. Finally, the candidate sampling function for softmax equation (4) chooses a uniform or loguniform distribution of sampled classes. For the RBF kernel function we choose parameter σ from the range [10^-5, 10^-4, ..., 1, 10]; for the Polynomial function we set c = 0 and d = 2.
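
The feature-based setup in the Experiment Setup row can be illustrated with a minimal Python sketch. It is not taken from the authors' repository. The function names (anonymize, count_anonymous_walks, sample_size) are hypothetical, and the sample-size formula is an assumption: equation (2) is not reproduced on this page, so the form m = ceil((2 / ε²) · (ln(2^η − 2) − ln δ)), with η the number of anonymous walks of length l, is recalled from the paper and should be checked against it. Labels are 0-based here, which may differ from the paper's numbering.

```python
import math
from functools import lru_cache


def anonymize(walk):
    """Replace each node by the order in which it first appeared (0-based)."""
    first_seen = {}
    labels = []
    for node in walk:
        if node not in first_seen:
            first_seen[node] = len(first_seen)
        labels.append(first_seen[node])
    return tuple(labels)


def count_anonymous_walks(length):
    """Count anonymous walks with `length` steps, assuming no self-loops.

    A walk starts at label 0; each step either revisits an existing label
    (other than the current one) or introduces the next unused label.
    """
    @lru_cache(maxsize=None)
    def extend(last, used, steps_left):
        if steps_left == 0:
            return 1
        total = 0
        for nxt in range(used + 1):  # nxt == used means "new label"
            if nxt != last:
                total += extend(nxt, max(used, nxt + 1), steps_left - 1)
        return total

    return extend(0, 1, length)


def sample_size(eta, eps, delta):
    """Assumed form of equation (2); requires eta >= 2 so the log is defined."""
    return math.ceil(2 / eps**2 * (math.log(2**eta - 2) - math.log(delta)))


if __name__ == "__main__":
    # Walk lengths and (eps, delta) as quoted in the Experiment Setup row.
    for l in range(2, 11):
        eta = count_anonymous_walks(l)
        print(l, eta, sample_size(eta, eps=0.1, delta=0.05))
```

Under these assumptions the number of walks to sample grows quickly with l, because η itself grows super-exponentially with the walk length.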
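
The evaluation protocol in the Dataset Splits row (10-fold cross-validation with C chosen per fold on a validation set) could be organised roughly as follows. This is a sketch under stated assumptions, not the authors' code: the page does not say how the validation set is carved out, so holding out 10% of each fold's training portion is an assumption, as is refitting on training plus validation data before scoring the test fold. The fit_and_score callable is a hypothetical placeholder for training an SVM with a given C and returning its accuracy.

```python
import random

# C values quoted in the Dataset Splits row.
C_GRID = [0.001, 0.01, 0.1, 1, 10]


def k_fold_indices(n, k=10, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(n, fit_and_score, k=10, val_fraction=0.1, seed=0):
    """Return mean test accuracy over k folds.

    fit_and_score(c, train_idx, eval_idx) is a hypothetical callable that
    trains a classifier with parameter c on train_idx and returns its
    accuracy on eval_idx.
    """
    folds = k_fold_indices(n, k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        rest = [j for f, fold in enumerate(folds) if f != i for j in fold]
        # Assumption: the validation set is a slice of the non-test data.
        n_val = max(1, int(len(rest) * val_fraction))
        val_idx, train_idx = rest[:n_val], rest[n_val:]
        best_c = max(C_GRID, key=lambda c: fit_and_score(c, train_idx, val_idx))
        # Assumption: refit on train + validation before scoring the test fold.
        scores.append(fit_and_score(best_c, train_idx + val_idx, test_idx))
    return sum(scores) / k
```

Passing the scorer as a callable keeps the sketch independent of any particular SVM library, which matters here because the paper does not name its software dependencies.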