Active Learning of Convex Halfspaces on Graphs

Authors: Maximilian Thiessen, Thomas Gaertner

NeurIPS 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental After presenting an empirical validation of our assumptions and approach in Section 5, we conclude with a discussion of future work. and 5 Experiments Having discussed our bounds on the query complexity and the conceptual benefits of the halfspace assumption in terms of theoretical upper bounds, we want to see whether we also get empirical benefits on data, as well.
Researcher Affiliation Academia Maximilian Thiessen Research Unit of Machine Learning TU Wien, Vienna, Austria maximilian.thiessen@tuwien.ac.at Thomas Gärtner Research Unit of Machine Learning TU Wien, Vienna, Austria thomas.gaertner@tuwien.ac.at
Pseudocode Yes Algorithm 1: Ray-based Active Halfspace Learning on Graphs Input: graph G, oracle access to labels λ Output: halfspaces corresponding to both classes
Open Source Code Yes https://github.com/maxthiessen/active_graph_halfspaces
Open Datasets Yes Table 2: Number of convex communities. dataset convex communities DBLP 4308/5000 Amazon 3999/5000 Youtube 2990/5000 Live Journal 1649/5000 Orkut 363/5000 Eu-core 7/42 and As datasets, we use ε-nearest-neighbour graphs of two moons1 and Iris2, see Figures 1a and 1b. The parameter ε was selected such that the graph is connected and the labels are halfspace separable. Additionally, we use a 20 20 grid and a 210 hypercube labelled with a random halfspace. Footnote 1: make_moons(noise=0.1, random_state=0) in scikit-learn [Pedregosa et al., 2011] and Footnote 2: first class vs the other two [Fisher, 1936].
Dataset Splits No The paper does not provide explicit training/test/validation dataset splits. It states: 'We check the accuracy on the whole graph after each query of the 20 queries.'
Hardware Specification Yes We implemented all approaches in python3 and ran the experiments on an Ubuntu 21.04 laptop with 32GB main memory.
Software Dependencies No The paper mentions 'python3' and 'scikit-learn' but does not provide specific version numbers for these or other key software dependencies.
Experiment Setup Yes The parameter ε was selected such that the graph is connected and the labels are halfspace separable. Additionally, we use a 20 20 grid and a 210 hypercube labelled with a random halfspace. and The other three approaches perform label propagation [Zhu et al., 2003a] with the default Gaussian similarity and length-scale σ = 1 for prediction.