Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Directed Probabilistic Watershed

Authors: Enrique Fita Sanmartín, Sebastian Damrich, Fred A. Hamprecht

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | we run an illustrative experiment to show the performance of the DProbWS on node classification. ... We compare the DProbWS with the methods exposed in [9, 11, 49] referred as ARW, GTG and LLUD respectively. |
| Researcher Affiliation | Academia | Enrique Fita Sanmartín, Sebastian Damrich, Fred A. Hamprecht. HCI/IWR at Heidelberg University, 69120 Heidelberg, Germany. {enrique.fita.sanmartin, sebastian.damrich, fred.hamprecht}@iwr.uni-heidelberg.de |
| Pseudocode | Yes | Algorithm 1: DProbWS |
| Open Source Code | Yes | Code publicly available at https://github.com/hci-unihd/Directed_Probabilistic_Watershed.git |
| Open Datasets | Yes | We construct kNN graphs, with k = 5, from the UCI datasets [10] Digits [44] and 20Newsgroups [23]. Additionally we consider the Email-EU network [26, 27, 46], the Cora network [29] and Citeseer X network [33]. |
| Dataset Splits | No | The paper describes how labeled nodes are sampled as seeds ('sampling a certain fraction r of all nodes from each class uniformly as seeds') but does not specify distinct training, validation, and test splits or a cross-validation setup. |
| Hardware Specification | No | The paper does not specify the hardware used for the experiments (e.g., GPU models, CPU types, or memory specifications). |
| Software Dependencies | No | The paper does not specify any software dependencies with version numbers. |
| Experiment Setup | Yes | We construct kNN graphs, with k = 5... Inspired by [9], we sample a certain fraction r of all nodes from each class uniformly as seeds. In Figure 3, we show the average accuracy over 20 runs for each of the r values between 0.1 and 0.9. |
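
The seed-sampling protocol quoted under Experiment Setup can be illustrated with a minimal sketch. This is not the authors' code: the names `sample_seeds`, `average_accuracy`, and `classify` are hypothetical, and two details are assumptions the quoted text does not settle, namely that the per-class seed count is rounded to the nearest integer with at least one seed per class, and that accuracy is measured on the non-seed nodes only.

```python
import random
from collections import defaultdict


def sample_seeds(labels, r, rng):
    """Pick a fraction r of the nodes of each class uniformly at random as seeds."""
    by_class = defaultdict(list)
    for node, label in enumerate(labels):
        by_class[label].append(node)
    seeds = []
    for nodes in by_class.values():
        # Assumption: round to the nearest integer, keep at least one seed per class.
        n_seeds = max(1, round(r * len(nodes)))
        seeds.extend(rng.sample(nodes, n_seeds))
    return sorted(seeds)


def average_accuracy(labels, r, classify, runs=20, seed=0):
    """Average accuracy over `runs` independent seed draws.

    `classify(seeds, seed_labels)` is a hypothetical callback that returns one
    predicted label per node; any semi-supervised method could be plugged in.
    """
    rng = random.Random(seed)
    accuracies = []
    for _ in range(runs):
        seeds = sample_seeds(labels, r, rng)
        seed_set = set(seeds)
        predicted = classify(seeds, [labels[s] for s in seeds])
        # Assumption: accuracy is evaluated on the non-seed nodes only.
        unlabeled = [v for v in range(len(labels)) if v not in seed_set]
        if not unlabeled:
            raise ValueError("r leaves no unlabeled nodes to evaluate on")
        correct = sum(predicted[v] == labels[v] for v in unlabeled)
        accuracies.append(correct / len(unlabeled))
    return sum(accuracies) / runs
```

Calling `average_accuracy` once per value of r between 0.1 and 0.9 would produce one accuracy curve per method, in the spirit of the Figure 3 setup described above.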