reproducibilityindex.ai

Distributed Submodular Cover: Succinctly Summarizing Massive Data

Authors: Baharan Mirzasoleiman, Amin Karbasi, Ashwinkumar Badanidiyuru, Andreas Krause

NeurIPS 2015

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	In our extensive experiments, we demonstrate the effectiveness of our approach on several applications, including active set selection, exemplar based clustering, and vertex cover on tens of millions of data points using Spark.
Researcher Affiliation	Collaboration	Baharan Mirzasoleiman (ETH Zurich); Amin Karbasi (Yale University); Ashwinkumar Badanidiyuru (Google); Andreas Krause (ETH Zurich)
Pseudocode	Yes	Algorithm 1 Approximate Submodular Cover; Algorithm 2 Approximate OPTCARD; Algorithm 3 DISCOVER; Algorithm 4 Greedy Distributed Submodular Maximization (GREEDI)
Open Source Code	No	The paper does not provide a link or explicit statement about the availability of its source code.
Open Datasets	Yes	We perform our experiments on a set of 10,000 Tiny Images [28]...We use the Parkinsons Telemonitoring dataset [29]...As our large scale experiment, we applied DISCOVER to the Friendster network... [30]
Dataset Splits	No	The paper describes running experiments on datasets and evaluating coverage percentages, but it does not specify train, validation, or test splits for data partitioning or model training in the conventional sense.
Hardware Specification	Yes	Our experimental infrastructure was a cluster of 8 quad-core machines with 32GB of memory each, running Spark.
Software Dependencies	No	The paper mentions 'running Spark' but does not specify the version number of Spark or any other software dependencies.
Experiment Setup	Yes	We set the number of reducers to m = 64...We first distributed the data uniformly at random to the machines, where each machine received 1,025,130 vertices (12.5GB RAM). Then we start with ℓ = 1, perform a map/reduce task to extract one element...We examine the performance of DISCOVER by obtaining covers for 50%, 30%, 20% and 10% of the whole graph...α = 1.
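
The Experiment Setup row describes a loop: partition the data uniformly at random across machines, start with ℓ = 1, run a map/reduce round, and stop once the requested fraction of the graph is covered. The single-process Python sketch below illustrates that loop under several assumptions that are not taken from the paper: the toy coverage function (a vertex covers itself and its neighbours), the doubling of ℓ between rounds, and the names `covered_set`, `greedy`, and `distributed_cover` are all hypothetical. It is not the authors' DISCOVER implementation and does not use Spark.

```python
import random


def covered_set(graph, selected):
    """Vertices covered by `selected` (assumed: a pick covers itself and its neighbours)."""
    covered = set()
    for v in selected:
        covered.add(v)
        covered |= graph[v]
    return covered


def greedy(graph, candidates, budget):
    """Standard greedy: repeatedly add the candidate with the largest marginal gain."""
    picked, covered = [], set()
    remaining = set(candidates)
    for _ in range(budget):
        best, best_gain = None, 0
        for v in sorted(remaining):  # sorted only to make tie-breaking deterministic
            gain = len(({v} | graph[v]) - covered)
            if gain > best_gain:
                best, best_gain = v, gain
        if best is None:  # no candidate adds coverage any more
            break
        picked.append(best)
        covered |= {best} | graph[best]
        remaining.discard(best)
    return picked


def distributed_cover(graph, m, target_fraction, seed=0):
    """Two-round greedy per budget ell; ell is doubled until the target is met (assumption)."""
    rng = random.Random(seed)
    vertices = sorted(graph)
    target = target_fraction * len(vertices)

    # Assign every vertex to one of m partitions uniformly at random.
    parts = [[] for _ in range(m)]
    for v in vertices:
        parts[rng.randrange(m)].append(v)

    ell = 1
    while True:
        # "Map": each partition runs greedy on its own vertices.
        local = [greedy(graph, part, ell) for part in parts]
        # "Reduce": run greedy again on the union of the local solutions.
        merged = [v for sol in local for v in sol]
        central = greedy(graph, merged, ell)
        # Keep whichever candidate solution covers the most vertices.
        best = max(local + [central], key=lambda s: len(covered_set(graph, s)))
        if len(covered_set(graph, best)) >= target or ell >= len(vertices):
            return best
        ell *= 2  # assumed growth rule, chosen for illustration


if __name__ == "__main__":
    # Toy graph: a star centred on vertex 0 plus a separate edge 5-6.
    toy = {0: {1, 2, 3, 4}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {6}, 6: {5}}
    print(distributed_cover(toy, m=2, target_fraction=0.5))
```

Here `m` plays the role of the number of reducers and `target_fraction` the requested cover (for example 0.5 for a 50% cover). A real deployment would replace the list comprehension over `parts` with a distributed map step.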