Connected Subgraph Detection with Mirror Descent on SDPs
Authors: Cem Aksoylar, Lorenzo Orecchia, Venkatesh Saligrama
ICML 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We present experiments on two datasets: a real world geographical network of disease outbreaks and elevated mean detection on very large random geometric graphs. In the former we compare the statistical detection performance of our mirror descent (MD) algorithm with subgraph detection methods from related work. For the latter we demonstrate the scalability of our method on large graphs. |
| Researcher Affiliation | Academia | 1Boston University, Massachusetts, USA. Correspondence to: Venkatesh Saligrama <srv@bu.edu>. |
| Pseudocode | Yes | Algorithm 1 Mirror Descent Algorithm |
| Open Source Code | No | The paper does not provide an explicit statement or a link to open-source code for the methodology described. |
| Open Datasets | Yes | We consider a geographical map and its corresponding network that are illustrated in Figures 1b and 1a respectively, with 129 nodes representing counties in the northeastern United States and average degree 4.7. Following (Patil et al., 2003) and (Qian et al., 2014; Qian & Saligrama, 2014), we consider an elevated mean Poisson formulation for modeling the diseased population... |
| Dataset Splits | No | To quantify detection performance, we threshold the scan statistics given by the algorithms with various threshold values and compute missed detection and false positive rates over a number of samples (50 for MD and LMIT, 25 for SA and NB) generated from both H0 and H1. |
| Hardware Specification | Yes | The experiments were run on MATLAB on a computer with an Intel i5 4590 processor. |
| Software Dependencies | No | The experiments were run on MATLAB on a computer with an Intel i5 4590 processor. This mentions software (MATLAB) but does not provide a version number. Other mentioned libraries or methods also lack version numbers for reproducibility. |
| Experiment Setup | Yes | For the MD method we consider the optimization value xx> M as the scan statistic, with T = 100 iterations, = 5 and different γ values to quantify the size and conductance of the anomalous graph. For LMIT we use the same anchor node as MD, anomaly size |S| = 16 corresponding to the ground truth and consider scan statistic x> diag(M). We search over a range of values for parameter γ. For SA and NB we consider the test statistic. (...) For MD, we chose γ2 = 10−3 for the thick cluster and 5 × 10−4 for the thin one... k = 10 vectors for approximating Y and ran the algorithm for T = 300 iterations. |