
Graph Scan Statistics With Uncertainty

Authors: Jose Cadena, Arinjoy Basak, Anil Vullikanti, Xinwei Deng

AAAI 2018

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our proposed methods on synthetic and real datasets, and we observe that our methods give significant improvement in the detection power as well as optimization objective, relative to a baseline.
Researcher Affiliation	Academia	Jose Cadena, Arinjoy Basak, Anil Vullikanti: Department of Computer Science and Biocomplexity Institute, Virginia Tech, Blacksburg, VA 24061, {jcadena,arinjoyb,vsakumar}@vt.edu. Xinwei Deng: Department of Statistics, Virginia Tech, Blacksburg, VA 24061, xdeng@vt.edu.
Pseudocode	Yes	Algorithm 1: AGGREGATESAA((G(V, E), αmax), k, ϵ); Algorithm 2: MAXMINLPROUND(V, w, k) for Max-Min formulation without connectivity constraints; Algorithm 3: BESTMAX((G(V, E), αmax), k, ϵ) for Max-Min formulation with connectivity constraints.
Open Source Code	No	The paper states, "Many details are omitted for brevity, and are available at (Cadena and others 2018)," providing a URL. However, this URL links directly to the published PDF of the paper itself, not to source code or supplementary material containing code. Therefore, no concrete access to source code is provided.
Open Datasets	Yes	The Northeastern USA Benchmark (NEast). This dataset (Kulldorff, Tango, and Park 2003)...; Battle of the Water Sensor Networks (BWSN). This dataset (Ostfeld and others 2008)...
Dataset Splits	No	The paper describes how synthetic and real datasets were constructed and perturbed for evaluation but does not specify explicit training, validation, or test dataset splits (e.g., percentages or sample counts) for model development or evaluation.
Hardware Specification	No	The paper does not provide any specific details about the hardware (e.g., CPU, GPU models, memory) used to run the experiments.
Software Dependencies	No	The paper describes the algorithms and their theoretical properties but does not list any specific software dependencies or libraries with version numbers (e.g., Python 3.x, PyTorch 1.x, scikit-learn 0.x) that were used in the implementation or experimentation.
Experiment Setup	Yes	We simulate anomalous clusters in this network as follows: A cluster consists of a node selected at random and all its neighbors. ... The counts for a node v inside the cluster are sampled from Poisson(q·b(v)). We perform experiments with values of q of the form q = βp, where β > 1 is a parameter that we call signal strength. ... Then, we perturb the counts generated above using Gaussian noise. That is, for each node v, we sample and round down a count x(v) ∼ N(c(v), σ²), where σ² is a noise parameter. ... We control the noise on the sensors with a parameter ϵ. With probability ϵ, the real p-value of a sensor in the network is replaced by a random p-value uniformly sampled from the interval [0, 1].
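
The quoted experiment setup can be illustrated with a minimal Python sketch. This is not the authors' code (the report above found none released). All function and variable names here (`sample_poisson`, `simulate_counts`, `perturb_p_values`, `adj`, `baseline`) are hypothetical. The quote elides how counts outside the cluster are generated; the sketch assumes they follow Poisson(p·b(v)), which is an assumption and not something the excerpt states.

```python
import math
import random


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) sample with Knuth's multiplication method.

    Suitable for small means only; exp(-lam) underflows for very large lam.
    """
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k


def simulate_counts(adj, baseline, p, beta, sigma2, rng):
    """Plant one anomalous cluster and return (cluster, clean, noisy) counts.

    adj      -- dict: node -> list of neighbours (hypothetical graph format)
    baseline -- dict: node -> baseline b(v)
    p, beta  -- base rate and signal strength; inside the cluster q = beta * p
    sigma2   -- variance of the Gaussian perturbation
    """
    # A cluster is a randomly selected node together with all its neighbours.
    center = rng.choice(sorted(adj))
    cluster = {center} | set(adj[center])

    q = beta * p
    clean = {}
    for v in adj:
        # ASSUMPTION: nodes outside the cluster use rate p (elided in the quote).
        rate = q if v in cluster else p
        clean[v] = sample_poisson(rate * baseline[v], rng)

    # Perturb each count: sample from N(c(v), sigma^2) and round down.
    sigma = math.sqrt(sigma2)
    noisy = {v: math.floor(rng.gauss(c, sigma)) for v, c in clean.items()}
    return cluster, clean, noisy


def perturb_p_values(p_values, eps, rng):
    """With probability eps, replace a sensor's p-value by a Uniform[0, 1] draw."""
    return {
        v: (rng.uniform(0.0, 1.0) if rng.random() < eps else pv)
        for v, pv in p_values.items()
    }


if __name__ == "__main__":
    rng = random.Random(0)
    adj = {0: [1, 2], 1: [0], 2: [0, 3], 3: [2]}
    baseline = {v: 10.0 for v in adj}
    cluster, clean, noisy = simulate_counts(adj, baseline, 0.5, 2.0, 1.0, rng)
    print("cluster:", sorted(cluster))
    print("clean counts:", clean)
    print("noisy counts:", noisy)
```

Because the excerpt does not mention clipping, the sketch leaves perturbed counts as they come out of the floor operation, so small counts can become negative under large σ².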