TellTail: Fast Scoring and Detection of Dense Subgraphs

Authors: Bryan Hooi, Kijung Shin, Hemank Lamba, Christos Faloutsos4150-4157

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental Our approach (a) provides theoretical guarantees of consistency, (b) scales quasi-linearly, and (c) outperforms baselines in synthetic and ground truth settings. Figure 1a shows that our measure outperforms baselines in its accuracy of identifying injected subgraphs. Figure 1b shows that our search algorithm is fast and scales quasi-linearly: it took 0.41s per subgraph to find, on a graph with 49 million edges, on a laptop computer.
Researcher Affiliation Academia Bryan Hooi,1 Kijung Shin,2 Hemank Lamba,3 Christos Faloutsos3 1National University of Singapore 2KAIST 3Carnegie Mellon University
Pseudocode Yes Algorithm 1 TELLTAIL+
Open Source Code Yes Reproducibility: Our code and datasets are available at https://bhooi.github.io/projects/telltail.
Open Datasets Yes Reproducibility: Our code and datasets are available at https://bhooi.github.io/projects/telltail.
Dataset Splits No The paper describes how null subgraphs and ground truth communities are used for evaluation but does not specify a validation dataset split (e.g., percentages or counts).
Hardware Specification No The paper mentions that experiments were run 'on a laptop computer' but does not provide specific details such as CPU model, GPU model, or memory, as required for a full hardware specification.
Software Dependencies No The paper mentions the use of 'Fibonacci heaps' but does not provide specific software dependencies with version numbers (e.g., Python version, library versions) for reproducibility.
Experiment Setup Yes In each trial we inject a dense subgraph, whose nodes are uniformly sampled at random from the graph s existing nodes, and whose size is one of n , 2 n , , 5 n (we obtain separate results for each of these sizes). ... We randomly generate a pool of null subgraphs: for each size (twice), we generate 200 random subgraphs of that size and add the community with highest mass into the pool.