Improved large-scale graph learning through ridge spectral sparsification

Authors: Daniele Calandriello, Alessandro Lazaric, Ioannis Koutis, Michal Valko

ICML 2018 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically validate our theoretical findings by testing how (ε, γ)-sparsifiers improve computational complexity without sacrificing final accuracy. Dataset. We run experiments on the Amazon co-purchase graph (Sadhanala et al., 2016). This graph fits our setting: it cannot be generated from vectorial data and is only artificially sparse, since the crawler that created it had no access to the true private co-purchase network held by Amazon. To compensate, Gleich & Mahoney (2015) use a densification procedure that, given the graph adjacency matrix A_G, computes all k-step neighbors A_{G,k} = Σ_{s=1}^{k} A_G^s. We make the graph unweighted for numerical stability. The final graph has n = 334,863 nodes and m = 98,465,352 edges, with an average degree of 294. We follow an approach similar to Sadhanala et al. (2016) and introduce a hand-designed smooth signal as a target. We then perform 2000 iterations of the power method to compute an approximation of the smallest eigenvector v_min, which is used as a smooth function over the graph. Baselines. For all setups, we compute an exact solution (up to convergence error) using a fast linear solver. Computing this EXACT baseline requires O(m log(n)) time and O(m) space and achieves the best performance. Afterwards, we compare three different sparsification procedures to evaluate whether they can accelerate computation while preserving accuracy. We run DiSRe with different values of γ depending on the setting. For empirically strong heuristics, we attempted to uniformly subsample the edges, but at the sparsity level achieved by the other methods, the uniformly sampled sparsifier is disconnected and highly inaccurate. Instead, we compare to the state-of-the-art k-neighbors (kN) heuristic by Sadhanala et al. (2016), which is just as fast as uniform sampling and more accurate in practice. (A hedged sketch of the densification and smooth-signal steps appears after the table.)
Researcher Affiliation | Collaboration | 1 SequeL team, INRIA Lille - Nord Europe, France; 2 LCSL, IIT, Italy, and MIT, USA; 3 New Jersey Institute of Technology, USA; 4 Facebook AI Research, Paris, France.
Pseudocode | Yes | Algorithm 1: The DiSRe algorithm. Input: G; Output: H_G. 1: Partition G into k sub-graphs G_ℓ 2: H_{1,ℓ} ← G_ℓ = {(e_{i,j}, q_e = 1, p_{1,e} = 1)} 3: Initialize set S_1 = {H_{1,ℓ} : ℓ = 1, ..., k} 4: for h = 1, ..., k - 1 do 5: Pick two sparsifiers H_{h,i}, H_{h,i'} from S_h 6: H ← Merge-Resparsify(H_{h,i}, H_{h,i'}) 7: Place H back into S_{h+1} 8: end for 9: Return H_G, the last sparsifier in S_k. (A Python skeleton of this control flow is sketched after the table.)
Open Source Code | No | The paper does not provide explicit statements or links to open-source code for the methodology described.
Open Datasets | Yes | Dataset. We run experiments on the Amazon co-purchase graph (Sadhanala et al., 2016).
Dataset Splits | No | We set f = v_min and test different levels of noise, log10(σ) ∈ {-3, -2, -1, 0}. After constructing the sparsifier H, we compute an approximate solution f̃ using LAPSMO (Eq. 2) with λ ∈ {10^-3, 10^-2, 10^-1, 1, 10}. ... The labels are generated by taking the sign of f = v_min, and ℓ ∈ {20, 346, 672, 1000} labels are revealed. The labeled nodes are chosen at random so that 0 and 1 labels are balanced in the dataset. (A hedged sketch of the balanced label selection appears after the table.)
Hardware Specification | No | Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER, and several universities as well as other organizations (see https://www.grid5000.fr).
Software Dependencies | No | The paper mentions using a 'fast linear solver' and an 'SDD solver' but does not specify any software names with version numbers for reproducibility.
Experiment Setup | Yes | We then perform 2000 iterations of the power method to compute an approximation of the smallest eigenvector v_min, which is used as a smooth function over the graph. ... We set f = v_min and test different levels of noise, log10(σ) ∈ {-3, -2, -1, 0}. After constructing the sparsifier H, we compute an approximate solution f̃ using LAPSMO (Eq. 2) with λ ∈ {10^-3, 10^-2, 10^-1, 1, 10}. ... We run Stable-HFS with λ ∈ {10^-6, 10^-4, 10^-2, 1}. ... We repeat each experiment 10 times with different sparsifiers and report the average performance of f̃ on the specific task and its standard deviation. (A hedged sketch of this evaluation loop appears after the table.)
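
Below is a minimal sketch of the densification and smooth-signal construction quoted in the Research Type row, assuming SciPy sparse matrices. The paper does not state which operator the power method is applied to; the sketch assumes the combinatorial Laplacian with a spectral shift and with the constant eigenvector projected out. Names such as densify, smooth_signal, and k_steps are illustrative, not taken from the authors' code.

import numpy as np
import scipy.sparse as sp

def densify(adj, k_steps):
    # Accumulate all s-step neighbors, s = 1..k: A_{G,k} = sum_s A_G^s,
    # then drop self-loops and weights (unweighted graph), as described in the paper.
    acc = adj.copy()
    power = adj.copy()
    for _ in range(k_steps - 1):
        power = power @ adj            # next power A_G^s
        acc = acc + power
    acc = acc.tolil()
    acc.setdiag(0)                     # remove self-loops
    acc = acc.tocsr()
    acc.eliminate_zeros()
    acc.data[:] = 1.0                  # make the graph unweighted
    return acc

def smooth_signal(adj, n_iters=2000, seed=0):
    # Power method on a shifted Laplacian to approximate the smallest
    # non-trivial eigenvector, used as the smooth target signal v_min.
    n = adj.shape[0]
    deg = np.asarray(adj.sum(axis=1)).ravel()
    lap = sp.diags(deg) - adj
    shift = 2.0 * deg.max()            # upper bound on the largest Laplacian eigenvalue
    ones = np.ones(n) / np.sqrt(n)     # trivial eigenvector (eigenvalue 0)
    v = np.random.default_rng(seed).standard_normal(n)
    for _ in range(n_iters):
        v = shift * v - lap @ v        # apply (shift * I - L)
        v -= ones * (ones @ v)         # project out the constant component
        v /= np.linalg.norm(v)
    return v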
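
The Pseudocode row quotes only the control flow of DiSRe. The skeleton below mirrors that divide, merge, and resparsify loop under stated assumptions: the Edge container, the edge-based partitioning, and the resparsify callback (which in the paper subsamples edges using approximate ridge effective resistances) are placeholders for illustration, not the authors' implementation, and the paper's round-based pairing is simplified to a sequential merge.

from dataclasses import dataclass

@dataclass
class Edge:
    i: int
    j: int
    weight: float = 1.0   # q_e in the pseudocode
    prob: float = 1.0     # p_{1,e}: probability with which the edge was kept

def disre(edges, k, resparsify):
    # Lines 1-3: partition the edges into k sub-graphs, each trivially sparsifying itself.
    chunk = max(1, len(edges) // k)
    sparsifiers = [edges[s:s + chunk] for s in range(0, len(edges), chunk)]
    # Lines 4-8: repeatedly merge two sparsifiers and resparsify their union.
    while len(sparsifiers) > 1:
        h_a = sparsifiers.pop()
        h_b = sparsifiers.pop()
        sparsifiers.append(resparsify(h_a + h_b))
    # Line 9: the last remaining sparsifier approximates the whole graph.
    return sparsifiers[0]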
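
For the semi-supervised setting in the Dataset Splits row, the excerpt only states that the revealed labels are chosen at random and class-balanced. A minimal sketch of one way to do that follows, assuming ±1 labels from the sign of v_min; reveal_balanced_labels is a hypothetical helper, not from the paper.

import numpy as np

def reveal_balanced_labels(v_min, n_labels, seed=0):
    # Binary labels from the sign of the smooth signal, then reveal n_labels
    # node indices drawn at random with the two classes balanced.
    labels = np.sign(v_min)
    rng = np.random.default_rng(seed)
    pos = rng.permutation(np.flatnonzero(labels > 0))[: n_labels // 2]
    neg = rng.permutation(np.flatnonzero(labels < 0))[: n_labels - n_labels // 2]
    return np.concatenate([pos, neg])   # indices of the revealed labels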
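
The Experiment Setup row describes a grid over noise levels and regularisation values with 10 repetitions per cell. The loop below is a hedged reconstruction of that protocol: build_sparsifier and solve_lapsmo are placeholders standing in for DiSRe and the LAPSMO solver of Eq. 2, and the relative-error metric is an assumption (the paper reports task-specific performance).

import numpy as np

NOISE_LEVELS = [1e-3, 1e-2, 1e-1, 1.0]        # log10(sigma) in {-3, -2, -1, 0}
LAMBDAS = [1e-3, 1e-2, 1e-1, 1.0, 10.0]
N_REPEATS = 10

def run_regression_grid(graph, f_target, gamma, build_sparsifier, solve_lapsmo, seed=0):
    rng = np.random.default_rng(seed)
    results = {}
    for sigma in NOISE_LEVELS:
        y = f_target + sigma * rng.standard_normal(f_target.size)   # noisy observations
        for lam in LAMBDAS:
            errors = []
            for _ in range(N_REPEATS):                   # fresh sparsifier on every repeat
                sparsifier = build_sparsifier(graph, gamma)
                f_hat = solve_lapsmo(sparsifier, y, lam)
                errors.append(np.linalg.norm(f_hat - f_target) / np.linalg.norm(f_target))
            results[(sigma, lam)] = (float(np.mean(errors)), float(np.std(errors)))
    return results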