TellTail: Fast Scoring and Detection of Dense Subgraphs
Authors: Bryan Hooi, Kijung Shin, Hemank Lamba, Christos Faloutsos4150-4157
AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our approach (a) provides theoretical guarantees of consistency, (b) scales quasi-linearly, and (c) outperforms baselines in synthetic and ground truth settings. Figure 1a shows that our measure outperforms baselines in its accuracy of identifying injected subgraphs. Figure 1b shows that our search algorithm is fast and scales quasi-linearly: it took 0.41s per subgraph to find, on a graph with 49 million edges, on a laptop computer. |
| Researcher Affiliation | Academia | Bryan Hooi,1 Kijung Shin,2 Hemank Lamba,3 Christos Faloutsos3 1National University of Singapore 2KAIST 3Carnegie Mellon University |
| Pseudocode | Yes | Algorithm 1 TELLTAIL+ |
| Open Source Code | Yes | Reproducibility: Our code and datasets are available at https://bhooi.github.io/projects/telltail. |
| Open Datasets | Yes | Reproducibility: Our code and datasets are available at https://bhooi.github.io/projects/telltail. |
| Dataset Splits | No | The paper describes how null subgraphs and ground truth communities are used for evaluation but does not specify a validation dataset split (e.g., percentages or counts). |
| Hardware Specification | No | The paper mentions that experiments were run 'on a laptop computer' but does not provide specific details such as CPU model, GPU model, or memory, as required for a full hardware specification. |
| Software Dependencies | No | The paper mentions the use of 'Fibonacci heaps' but does not provide specific software dependencies with version numbers (e.g., Python version, library versions) for reproducibility. |
| Experiment Setup | Yes | In each trial we inject a dense subgraph, whose nodes are uniformly sampled at random from the graph s existing nodes, and whose size is one of n , 2 n , , 5 n (we obtain separate results for each of these sizes). ... We randomly generate a pool of null subgraphs: for each size (twice), we generate 200 random subgraphs of that size and add the community with highest mass into the pool. |