Optimal Transport for structured data with application on graphs

Authors: Titouan Vayer, Nicolas Courty, Romain Tavenard, Laetitia Chapel, Rémi Flamary

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We now illustrate the behaviour of FGW on synthetic and real datasets. The algorithms presented in the previous section have been implemented using the Python Optimal Transport toolbox (Flamary & Courty, 2017) and will be released upon publication. [Evidence drawn from Section 4.2, Graph-structured data classification, and the caption of Table 1, Average classification accuracy on the graph datasets with vector attributes.]
Researcher Affiliation | Academia | (1) Univ. Bretagne-Sud, CNRS, IRISA, F-56000 Vannes; (2) Univ. Côte d'Azur, CNRS, OCA Lagrange, F-06000 Nice; (3) Univ. Rennes, CNRS, LETG, F-35000 Rennes.
Pseudocode | Yes | Algorithm 1: Conditional Gradient (CG) for FGW; Algorithm 2: Line-search for CG (q = 2). (A runnable sketch of this conditional-gradient loop is given after the table.)
Open Source Code | No | The algorithms presented in the previous section have been implemented using the Python Optimal Transport toolbox (Flamary & Courty, 2017) and will be released upon publication.
Open Datasets | Yes | BZR, COX2 (Sutherland et al., 2003), PROTEINS, ENZYMES (Borgwardt & Kriegel, 2005), CUNEIFORM (Kriege et al., 2018) and SYNTHETIC (Feragen et al., 2013) are vector-attributed graphs. MUTAG (Debnath et al., 1991), PTC-MR (Kriege et al., 2016) and NCI1 (Wale et al., 2008) contain graphs with discrete attributes derived from small molecules. IMDB-B and IMDB-M (Yanardag & Vishwanathan, 2015) contain unlabeled graphs derived from social networks. All datasets are available in (Kersting et al., 2016).
Dataset Splits | Yes | To provide a fair comparison between methods, most papers on graph classification perform a nested cross-validation (using 9 folds for training, 1 for testing, and reporting the average accuracy of this experiment repeated 10 times) and report accuracies of the other methods taken from the original papers. (This protocol is sketched in code after the table.)
Hardware Specification | Yes | We gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X GPU used for this research.
Software Dependencies | No | The algorithms presented in the previous section have been implemented using the Python Optimal Transport toolbox (Flamary & Courty, 2017) and will be released upon publication.
Experiment Setup | Yes | For all methods using SVM, we cross-validate the parameter C ∈ {10^-7, 10^-6, ..., 10^7}. The range of the WL parameter H is {0, 1, ..., 10}, and we also compute this kernel with H fixed at 2 and 4. The decay factor λ for RWK is chosen in {10^-6, 10^-5, ..., 10^-2}; for the GK kernel we set the graphlet size κ = 3 and cross-validate the precision level ε and the confidence δ as in the original paper (Shervashidze et al., 2009). The t_max parameter for PROPAK is chosen within {1, 3, 5, 8, 10, 15, 20}. For PSCN, we choose the normalized betweenness centrality as the labeling procedure and cross-validate the batch size in {10, 15, ..., 35} and the number of epochs in {10, 20, ..., 100}. Finally, for FGW, γ is cross-validated within {2^-10, 2^-9, ..., 2^10} and α is cross-validated via a logspace search in [0, 0.5] and symmetrically in [0.5, 1] (15 values are drawn). (These grids are transcribed as a Python search space after the table.)
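
The Pseudocode row refers to Algorithm 1 (conditional gradient for FGW) and Algorithm 2 (exact line-search for the squared loss, q = 2). The following is a minimal illustrative sketch of that loop, not the authors' released implementation: it assumes symmetric structure matrices C1 and C2, a node-feature distance matrix M, histograms p and q, and a trade-off parameter alpha; it uses POT's exact solver ot.emd for the direction-finding step and replaces the closed-form line-search of Algorithm 2 with a crude grid search over the step size.

```python
import numpy as np
import ot  # Python Optimal Transport toolbox (Flamary & Courty, 2017)


def fgw_cost(T, M, C1, C2, p, q, alpha):
    """FGW objective: (1 - alpha) <M, T> + alpha * sum_ijkl (C1_ik - C2_jl)^2 T_ij T_kl."""
    # Squared-loss decomposition; the "const" part only depends on the marginals of T.
    const = np.outer(C1 ** 2 @ p, np.ones_like(q)) + np.outer(np.ones_like(p), C2 ** 2 @ q)
    quad = np.sum((const - 2 * C1 @ T @ C2.T) * T)
    return (1 - alpha) * np.sum(M * T) + alpha * quad


def fgw_cg(M, C1, C2, p, q, alpha=0.5, n_iter=100):
    """Conditional gradient (Frank-Wolfe) for FGW with squared loss, C1 and C2 symmetric."""
    T = np.outer(p, q)  # product coupling as a feasible starting point
    const = np.outer(C1 ** 2 @ p, np.ones_like(q)) + np.outer(np.ones_like(p), C2 ** 2 @ q)
    for _ in range(n_iter):
        # Gradient of the FGW cost at the current coupling T.
        grad = (1 - alpha) * M + 2 * alpha * (const - 2 * C1 @ T @ C2.T)
        # Direction-finding step: exact linear OT with the gradient as ground cost
        # (shifting the cost by a constant does not change the minimizer).
        S = ot.emd(p, q, grad - grad.min())
        # Crude grid line-search over the segment (1 - tau) T + tau S,
        # standing in for the closed-form line-search of Algorithm 2.
        taus = np.linspace(0.0, 1.0, 21)
        costs = [fgw_cost((1 - t) * T + t * S, M, C1, C2, p, q, alpha) for t in taus]
        best = np.argmin(costs)
        T = (1 - taus[best]) * T + taus[best] * S
    return T, fgw_cost(T, M, C1, C2, p, q, alpha)
```

On graph data, C1 and C2 would typically hold shortest-path matrices of the two graphs, M the pairwise distances between node features, and p and q uniform node weights; recent POT releases also ship a dedicated fused Gromov-Wasserstein solver that can replace such a hand-rolled loop.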
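
The Dataset Splits row describes a nested 10-fold protocol: 9 folds for training, 1 for testing, hyperparameters selected on the training folds only, and the resulting accuracy averaged. Below is a sketch of one repetition, assuming an FGW-based precomputed kernel matrix K (n x n), labels y, and a grid of SVM regularization values are already available; the function name and single-seed setup are illustrative, and the reported protocol repeats this with different splits and averages the results.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC


def nested_cv_accuracy(K, y, C_grid, seed=0):
    """One repetition of nested 10-fold CV for an SVM on a precomputed kernel K."""
    outer = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    accuracies = []
    for train, test in outer.split(K, y):
        # Inner 10-fold CV on the 9 training folds only, to select C.
        inner = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
        best_C, best_score = None, -np.inf
        for C in C_grid:
            scores = []
            for tr, va in inner.split(K[np.ix_(train, train)], y[train]):
                clf = SVC(kernel="precomputed", C=C)
                clf.fit(K[np.ix_(train[tr], train[tr])], y[train[tr]])
                scores.append(clf.score(K[np.ix_(train[va], train[tr])], y[train[va]]))
            if np.mean(scores) > best_score:
                best_C, best_score = C, np.mean(scores)
        # Refit on all 9 training folds with the selected C, score the held-out fold.
        clf = SVC(kernel="precomputed", C=best_C)
        clf.fit(K[np.ix_(train, train)], y[train])
        accuracies.append(clf.score(K[np.ix_(test, train)], y[test]))
    return float(np.mean(accuracies))
```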
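
The Experiment Setup row lists the hyperparameter grids; transcribed as Python search spaces they read as follows. The dictionary keys are hypothetical, and the construction of the FGW alpha grid is only one plausible reading of "a logspace search in [0, 0.5] and symmetrically [0.5, 1]", since the quote gives neither the lower endpoint nor whether the 15 values are per half or in total.

```python
import numpy as np

# Hypothetical names; the grids themselves are transcribed from the quoted setup.
param_grids = {
    "svm_C": [10.0 ** k for k in range(-7, 8)],        # all SVM-based methods
    "wl_H": list(range(0, 11)),                        # WL; H = 2 and H = 4 also run separately
    "rwk_lambda": [10.0 ** k for k in range(-6, -1)],  # decay factor for RWK
    "gk_graphlet_size": [3],                           # GK; epsilon, delta as in Shervashidze et al. (2009)
    "propak_tmax": [1, 3, 5, 8, 10, 15, 20],
    "pscn_batch_size": list(range(10, 36, 5)),
    "pscn_epochs": list(range(10, 101, 10)),
    "fgw_gamma": [2.0 ** k for k in range(-10, 11)],
}

# FGW alpha: log-spaced values in [0, 0.5], mirrored onto [0.5, 1].
# The lower endpoint (1e-4) is a guess not stated in the quote.
half = np.logspace(-4, np.log10(0.5), 15)
param_grids["fgw_alpha"] = np.unique(np.concatenate([half, 1.0 - half]))
```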