Graph Scan Statistics With Uncertainty
Authors: Jose Cadena, Arinjoy Basak, Anil Vullikanti, Xinwei Deng
AAAI 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our proposed methods on synthetic and real datasets, and we observe that our methods give significant improvement in the detection power as well as optimization objective, relative to a baseline. |
| Researcher Affiliation | Academia | Jose Cadena, Arinjoy Basak, Anil Vullikanti Department of Computer Science and Biocomplexity Institute, Virginia Tech, Blacksburg, VA 24061 {jcadena,arinjoyb,vsakumar}@vt.edu Xinwei Deng Department of Statistics, Virginia Tech, Blacksburg, VA 24061 xdeng@vt.edu |
| Pseudocode | Yes | Algorithm 1: AGGREGATESAA((G(V, E), α_max), k, ϵ); Algorithm 2: MAXMINLPROUND(V, w, k) for Max-Min formulation without connectivity constraints; Algorithm 3: BESTMAX((G(V, E), α_max), k, ϵ) for Max-Min formulation with connectivity constraints. |
| Open Source Code | No | The paper states, "Many details are omitted for brevity, and are available at (Cadena and others 2018)," providing a URL. However, this URL links directly to the published PDF of the paper itself, not to source code or supplementary material containing code. Therefore, no concrete access to source code is provided. |
| Open Datasets | Yes | The Northeastern USA Benchmark (NEast). This dataset (Kulldorff, Tango, and Park 2003)...; Battle of the Water Sensor Networks (BWSN). This dataset (Ostfeld and others 2008)... |
| Dataset Splits | No | The paper describes how synthetic and real datasets were constructed and perturbed for evaluation but does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for model development or evaluation. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper describes the algorithms and their theoretical properties but does not list any specific software dependencies or libraries with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn 0.x) that were used in the implementation or experimentation. |
| Experiment Setup | Yes | We simulate anomalous clusters in this network as follows: A cluster consists of a node selected at random and all its neighbors. ... The counts for a node v inside the cluster are sampled from Poisson(q·b(v)). We perform experiments with values of q of the form q = βp, where β > 1 is a parameter that we call signal strength. ... Then, we perturb the counts generated above using Gaussian noise. That is, for each node v, we sample and round down a count x(v) ∼ N(c(v), σ²), where σ² is a noise parameter. ... We control the noise on the sensors with a parameter ϵ. With probability ϵ, the real p-value of a sensor in the network is replaced by a random p-value uniformly sampled from the interval [0, 1]. |
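
The experiment setup quoted in the last row can be sketched in a few lines of Python. This is only an illustrative reading of the quoted text, not the authors' code (the paper releases none). The function names, the adjacency-dict graph representation, and the `baseline` mapping for b(v) are all hypothetical. The quoted passage elides how counts outside the cluster are generated; the sketch assumes they come from Poisson(p·b(v)), which is an assumption, not something the excerpt states.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(adj, baseline, p, beta, sigma2, rng):
    """Plant one anomalous cluster and return (cluster, true counts, noisy counts).

    adj      : dict mapping each node to a list of its neighbors (hypothetical format)
    baseline : dict mapping each node v to its baseline b(v)
    p        : base rate; ASSUMED to apply to nodes outside the cluster
    beta     : signal strength, beta > 1, so that q = beta * p inside the cluster
    sigma2   : variance of the Gaussian noise added to every count
    """
    center = rng.choice(sorted(adj))
    cluster = {center} | set(adj[center])  # a random node and all its neighbors
    q = beta * p
    true_counts, noisy_counts = {}, {}
    for v in adj:
        rate = (q if v in cluster else p) * baseline[v]
        c = sample_poisson(rate, rng)
        true_counts[v] = c
        # x(v) ~ N(c(v), sigma^2), rounded down. The excerpt does not say whether
        # negative values are clamped, so none are clamped here.
        noisy_counts[v] = math.floor(rng.gauss(c, math.sqrt(sigma2)))
    return cluster, true_counts, noisy_counts


def perturb_pvalues(pvalues, eps, rng):
    """With probability eps, replace each p-value by a uniform draw from [0, 1)."""
    return {v: (rng.random() if rng.random() < eps else pv)
            for v, pv in pvalues.items()}
```

Passing a seeded `random.Random` instance as `rng` keeps runs repeatable, which matters when comparing detection power across values of β, σ², and ϵ.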