Expected Probabilistic Hierarchies
Authors: Marcel Kollovieh, Bertrand Charpentier, Daniel Zügner, Stephan Günnemann
NeurIPS 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate EPH on synthetic and real-world datasets including vector and graph datasets. EPH outperforms all other approaches quantitatively and provides meaningful hierarchies in qualitative evaluations. |
| Researcher Affiliation | Collaboration | Marcel Kollovieh (1,2,3), Bertrand Charpentier (4), Daniel Zügner (5), Stephan Günnemann (1,2,3,4); (1) School of Computation, Information and Technology, Technical University of Munich; (2) Munich Data Science Institute; (3) Munich Center for Machine Learning; (4) Pruna AI; (5) Microsoft Research AI for Science |
| Pseudocode | Yes | In the following, we provide a formal description of our EPH algorithm, the subgraph sampling, and how we normalize graphs. Algorithm 1 EPH, Algorithm 2 Normalize Graph, Algorithm 3 Sample Subgraph *(a hedged sampling sketch follows the table)* |
| Open Source Code | Yes | Code available at https://www.cs.cit.tum.de/daml/expected-probabilistic-hierarchies |
| Open Datasets | Yes | We use the datasets Polblogs [1], Brain [2], Citeseer [34], Genes [11], Cora-ML [25, 5], Open Flight [29], Wiki Physics [3], and DBLP [38]. ... we selected the seven datasets Zoo, Iris, Glass, Digits, Segmentation, Spambase, and Letter from the UCI Machine Learning repository [15]. Furthermore, we also use Cifar-100 [23]. *(a dataset-loading sketch follows the table)* |
| Dataset Splits | No | Since we are in an unsupervised setting, we have no train/test split, i.e., we train and evaluate on the whole graph. |
| Hardware Specification | Yes | While Hyp HC, FPH, and EPH are executed on a GPU (NVIDIA A100), the remaining methods do not support or do not require GPU acceleration. |
| Software Dependencies | No | The paper mentions optimizers (PAdamax) and refers to some methods' implementations (e.g., Deep Walk embeddings [32]), but does not specify software library versions (e.g., PyTorch 1.x, Python 3.x, CUDA x.x). |
| Experiment Setup | Yes | We use n = 512 internal nodes, compress full hierarchies using the scheme presented by Charpentier and Bonald [8], and use Deep Walk embeddings [32] on the graphs for methods that require features. We train EPH using PAdamax (projected Adamax [21]), reduce the learning rate for B by a factor of 0.1 every 1000 epochs, and reset the probabilistic hierarchy to the so-far best discrete hierarchy. To approximate the expectation of EPH, we use 20 samples, except for Spambase, Letter, and Cifar-100, where we use 10, 1, and 1, respectively, to reduce the runtime. ... Both EPH and FPH are initialized using the average linkage algorithm. *(initialization and training-loop sketches follow the table)* |
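The three algorithms named in the Pseudocode row are not reproduced on this page. As a rough illustration of the expectation EPH approximates, here is a minimal sketch of sampling one discrete hierarchy from a probabilistic one, assuming the parent-matrix parameterization of Charpentier and Bonald [8] (a row-stochastic matrix A for leaf parents and B for internal-node parents); `sample_hierarchy` is our name, not the paper's.

```python
import numpy as np

def sample_hierarchy(A, B, rng=None):
    """Sample one discrete hierarchy from a probabilistic one.

    A: (n_leaves, n_internal) row-stochastic; A[i, j] is the probability
       that internal node j is the parent of leaf i.
    B: (n_internal - 1, n_internal) row-stochastic; B[j, k] is the
       probability that internal node k is the parent of internal node j.
       The last internal node is the root; a valid tree also requires B's
       support to point toward the root (e.g., strictly upper triangular).
    """
    rng = rng if rng is not None else np.random.default_rng()
    leaf_parents = np.array([rng.choice(len(row), p=row) for row in A])
    internal_parents = np.array([rng.choice(len(row), p=row) for row in B])
    return leaf_parents, internal_parents
```

Averaging a discrete quality score over several such samples gives the Monte Carlo estimate of the expectation; the paper reports using 20 samples on most datasets.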
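The UCI and Cifar-100 datasets are all publicly downloadable. As a convenience, a minimal loading sketch assuming scikit-learn; the paper does not say which loaders were used, and sklearn's `load_digits` ships only a subset of UCI's optdigits.

```python
from sklearn.datasets import fetch_openml, load_iris

X_iris = load_iris().data   # Iris: 150 samples, 4 features
spam = fetch_openml("spambase", version=1, as_frame=False)
X_spam = spam.data          # Spambase: 4601 samples, 57 features
# In the paper's unsupervised setting, labels are not used for training;
# they typically enter only evaluation metrics such as dendrogram purity.
```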
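Both EPH and FPH start from average linkage. A minimal sketch of such an initialization, assuming SciPy's agglomerative linkage and the same parent-matrix parameterization as above; the paper additionally compresses the resulting full binary hierarchy down to n = 512 internal nodes with the scheme of Charpentier and Bonald [8], which is omitted here.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.random.rand(32, 8)         # toy features (e.g., Deep Walk embeddings)
Z = linkage(X, method="average")  # average-linkage agglomeration
n = X.shape[0]                    # number of leaves
m = n - 1                         # internal nodes of a full binary hierarchy

# One-hot parent matrices: internal node i merges the two clusters in Z[i];
# SciPy labels leaves 0..n-1 and internal nodes n..2n-2.
A = np.zeros((n, m))
B = np.zeros((m - 1, m))
for i, (a, b, _dist, _size) in enumerate(Z):
    for child in (int(a), int(b)):
        if child < n:
            A[child, i] = 1.0     # leaf -> internal parent
        else:
            B[child - n, i] = 1.0  # internal -> internal parent
```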
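Finally, the optimization itself: PAdamax (projected Adamax), per the quote above, is Adamax with the parameters projected back onto the probability simplex after each step, with B's learning rate decayed by a factor of 0.1 every 1000 epochs. A hedged PyTorch sketch; the learning rates and `expected_cost` are illustrative placeholders, not values or code from the paper.

```python
import torch

def project_rows_to_simplex(x):
    """Euclidean projection of each row onto the probability simplex
    (Duchi et al., 2008), so A and B stay row-stochastic."""
    s, _ = torch.sort(x, dim=-1, descending=True)
    cssv = s.cumsum(dim=-1) - 1.0
    k = torch.arange(1, x.shape[-1] + 1, dtype=x.dtype, device=x.device)
    rho = ((s - cssv / k) > 0).sum(dim=-1, keepdim=True)
    theta = cssv.gather(-1, rho - 1) / rho.to(x.dtype)
    return torch.clamp(x - theta, min=0.0)

def expected_cost(A, B):
    # Toy stand-in for the Monte Carlo estimate of the expected score
    # over sampled discrete hierarchies (20 samples in most experiments).
    return (A.sum(dim=0) * B.sum(dim=0)).pow(2).mean()

n_leaves, n_internal = 32, 15
A = torch.full((n_leaves, n_internal), 1.0 / n_internal, requires_grad=True)
B = torch.full((n_internal - 1, n_internal), 1.0 / n_internal, requires_grad=True)

opt = torch.optim.Adamax([{"params": [A]}, {"params": [B]}], lr=1e-2)
# Decay only B's learning rate by a factor of 0.1 every 1000 epochs.
sched = torch.optim.lr_scheduler.LambdaLR(
    opt, lr_lambda=[lambda e: 1.0, lambda e: 0.1 ** (e // 1000)]
)

for epoch in range(3000):
    opt.zero_grad()
    expected_cost(A, B).backward()
    opt.step()
    sched.step()
    with torch.no_grad():  # the "P" in PAdamax: project back onto the simplex
        A.copy_(project_rows_to_simplex(A))
        B.copy_(project_rows_to_simplex(B))
# The paper additionally tracks the best discrete hierarchy found so far
# and resets the probabilistic hierarchy to it (omitted here).
```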