Going beyond persistent homology using persistent homology

Authors: Johanna Immonen, Amauri Souza, Vikas Garg

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Theoretical contributions of this work... (Figure 1)... In this section, we compare RePHINE to standard persistence diagrams from an empirical perspective. Our main goal is to evaluate whether our method enables powerful graph-level representation, confirming our theoretical analysis. Therefore, we conduct two main experiments. The first one leverages an artificially created dataset, expected to impose challenges to persistent homology and MP-GNNs. The second experiment aims to assess the predictive performance of RePHINE in combination with GNNs on popular benchmarks for graph classification.
Researcher Affiliation | Collaboration | Johanna Immonen (University of Helsinki) johanna.x.immonen@helsinki.fi; Amauri H. Souza (Aalto University and Federal Institute of Ceará) amauri.souza@aalto.fi; Vikas Garg (Aalto University and YaiYai Ltd) vgarg@csail.mit.edu
Pseudocode | Yes | Algorithm 1: Computing persistence diagrams; Algorithm 2: RePHINE. (A generic persistence sketch, not the paper's listing, appears after the table.)
Open Source Code | Yes | All methods were implemented in PyTorch [31], and our code is available at https://github.com/Aalto-QuML/RePHINE.
Open Datasets | Yes | To assess the performance of RePHINE on real data, we use six popular datasets for graph classification (details in the Supplementary): PROTEINS, IMDB-BINARY, NCI1, NCI109, MOLHIV and ZINC [7, 16, 20]... The ZINC and MOLHIV datasets have public splits. All models are initialized with a learning rate of 10^-3 that is halved if the validation loss does not improve over 10 epochs. (The plateau schedule is sketched after the table.)
Dataset Splits | Yes | For the TUDatasets, we obtain a random 80%/10%/10% (train/val/test) split, which is kept identical across five runs. The ZINC and MOLHIV datasets have public splits. (A seeded-split sketch follows the table.)
Hardware Specification | Yes | For all experiments, we use Tesla V100 GPU cards and consider a memory budget of 32GB of RAM.
Software Dependencies | No | The paper mentions implementation in 'PyTorch [31]' but does not specify a version number for PyTorch or any other software dependency.
Experiment Setup | Yes | Regarding the training, all models follow the same setting: we apply the Adam optimizer [21] for 2000 epochs with an initial learning rate of 10^-4 that is decreased by half every 400 epochs. We use batches of sizes 5, 8, 32 for the cubic08, cubic10, and cubic12 datasets, respectively... We carry out grid-search for model selection. More specifically, we consider a grid comprised of a combination of {2, 3} GNN layers and {2, 4, 8} filtration functions. We set the number of hidden units in the DeepSets and GNN layers to 64, and of the filtration functions to 16... The GNN node embeddings are combined using a global mean pooling layer. Importantly, for all datasets, we use the same architecture for RePHINE and color-based persistence diagrams. (The schedule and grid search are sketched after the table.)
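
The Pseudocode row lists Algorithm 1 (computing persistence diagrams) and Algorithm 2 (RePHINE), but the listings themselves are not reproduced in this report. For orientation only, below is a minimal sketch of the standard union-find computation of a 0-dimensional persistence diagram for a graph under a vertex (color-based) filtration. This is a generic textbook routine, not the paper's Algorithm 1 or RePHINE; the function name `persistence_diagram_0d` and the lower-star edge convention are assumptions.

```python
# Generic sketch, not the paper's Algorithm 1/2: 0-dimensional persistence of a
# graph under a vertex filtration, computed with union-find and the elder rule.
# Each edge enters at the max of its endpoints' values (lower-star convention).

def persistence_diagram_0d(vertices, edges, filtration):
    parent = {v: v for v in vertices}
    birth = {v: filtration[v] for v in vertices}  # every vertex births a component

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path compression
            v = parent[v]
        return v

    pairs = []
    # Process edges in the order they enter the filtration.
    for u, v in sorted(edges, key=lambda e: max(filtration[e[0]], filtration[e[1]])):
        death = max(filtration[u], filtration[v])
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge closes a cycle; it does not kill a 0-dim class
        # Elder rule: the component born later dies when the two merge.
        young, old = (ru, rv) if birth[ru] >= birth[rv] else (rv, ru)
        pairs.append((birth[young], death))
        parent[young] = old
    # One essential (never-dying) class per connected component.
    pairs.extend((birth[v], float("inf")) for v in vertices if find(v) == v)
    return pairs

# Example: a 3-vertex path with filtration values 0, 2, 1 yields the pairs
# (2, 2), (1, 2), and (0, inf); the zero-persistence pair (2, 2) is kept.
print(persistence_diagram_0d([0, 1, 2], [(0, 1), (1, 2)], {0: 0.0, 1: 2.0, 2: 1.0}))
```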
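
The Open Datasets row quotes a learning rate of 10^-3 that is halved when the validation loss stops improving for 10 epochs. A minimal PyTorch sketch of that schedule, assuming a placeholder model and a dummy validation loss (neither comes from the released code):

```python
import torch

# Placeholder model standing in for the actual GNN + persistence architecture.
model = torch.nn.Linear(16, 2)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Halve the learning rate once the validation loss has not improved for 10 epochs,
# mirroring the quoted setting for the real-data benchmarks.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.5, patience=10
)

for epoch in range(100):              # epoch budget here is illustrative
    # ... training steps for one epoch would go here ...
    val_loss = 1.0 / (epoch + 1)      # dummy stand-in for the validation loss
    scheduler.step(val_loss)          # scheduler monitors the validation loss
```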
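
The Dataset Splits row reports a random 80%/10%/10% split of the TUDatasets that is kept identical across five runs. One way to realize such a fixed split is to draw the permutation once from a seeded generator and reuse the resulting indices in every run; the seed value below is an assumption, not taken from the released code.

```python
import torch

def fixed_split_indices(num_graphs, seed=0, fractions=(0.8, 0.1, 0.1)):
    """Return train/val/test index tensors for an 80%/10%/10% split.

    Seeding the generator makes the permutation reproducible, so the same
    partition can be reused across all runs (seed=0 is an assumption).
    """
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(num_graphs, generator=gen)
    n_train = int(fractions[0] * num_graphs)
    n_val = int(fractions[1] * num_graphs)
    return perm[:n_train], perm[n_train:n_train + n_val], perm[n_train + n_val:]

# Example: split the 1113 graphs of PROTEINS once and reuse the indices in each run.
train_idx, val_idx, test_idx = fixed_split_indices(1113)
```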
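
The Experiment Setup row quotes Adam with an initial learning rate of 10^-4 halved every 400 epochs over 2000 epochs, plus a grid search over {2, 3} GNN layers and {2, 4, 8} filtration functions. A hedged sketch of that schedule and grid enumeration follows; `build_model` is a hypothetical placeholder, not the released API.

```python
import itertools
import torch

def build_model(num_layers, num_filtrations, hidden=64, filtration_hidden=16):
    # Hypothetical constructor standing in for the GNN + RePHINE architecture
    # (64 hidden units for the GNN/DeepSets layers, 16 for filtration functions).
    return torch.nn.Linear(hidden, 1)

def train_one_config(model, num_epochs=2000):
    # Quoted schedule for the synthetic (cubic) datasets: Adam with an initial
    # learning rate of 1e-4, halved every 400 epochs. The loop body is omitted.
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=400, gamma=0.5)
    for _ in range(num_epochs):
        # ... forward/backward pass and optimizer.step() would go here ...
        scheduler.step()

# Grid search from the quoted setup: {2, 3} GNN layers x {2, 4, 8} filtrations.
for num_layers, num_filtrations in itertools.product([2, 3], [2, 4, 8]):
    train_one_config(build_model(num_layers, num_filtrations))
```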