An Efficient Streaming Algorithm for the Submodular Cover Problem
Authors: Ashkan Norouzi-Fard, Abbas Bazzi, Ilija Bogunovic, Marwa El Halabi, Ya-Ping Hsieh, Volkan Cevher
NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We design the first Efficient bicriteria Submodular Cover Streaming (ESC-Streaming) algorithm for this problem, and provide theoretical guarantees for its performance supported by numerical evidence. In our numerical experiments, we evaluate the performance of ESC-Streaming on active set selection and large-scale graph cover problems. We evaluate the performance of ESC-Streaming on real-world data sets with two applications: active set selection and graph set cover problems, described in section 5. |
| Researcher Affiliation | Academia | Ashkan Norouzi-Fard ashkan.norouzifard@epfl.ch, Abbas Bazzi abbas.bazzi@epfl.ch, Marwa El Halabi marwa.elhalabi@epfl.ch, Ilija Bogunovic ilija.bogunovic@epfl.ch, Ya-Ping Hsieh ya-ping.hsieh@epfl.ch, Volkan Cevher volkan.cevher@epfl.ch. Theory of Computation Laboratory 2 (THL2), EPFL. Laboratory for Information and Inference Systems (LIONS), EPFL |
| Pseudocode | Yes | Algorithm 1 ESC-Streaming Algorithm Picking representative set. Algorithm 2 ESC-Streaming Algorithm Responding to the queries |
| Open Source Code | No | The paper does not explicitly state that source code for the methodology is openly available or provide a link to a repository. |
| Open Datasets | Yes | We evaluate the performance of ESC-Streaming on real-world data sets with two applications: active set selection and graph set cover problems, described in section 5. For active set selection, we choose a dataset having a size that permits the comparison with the offline greedy algorithm. The dataset consists of 7k small organic molecules, each represented by a 276 dimensional vector. For graph cover, we run ESC-Streaming on a large graph of 787 million nodes and 47.6 billion edges. We apply it to the "uk-2014" graph, a large snapshot of the .uk domain taken at the end of 2014 [5, 4, 3]. We include another instance of the Dominating set cover problem on a smaller graph Friendster", an online gaming network [29]. |
| Dataset Splits | No | The paper does not explicitly specify exact percentages or sample counts for training/validation/test splits, nor does it provide citations to predefined splits for its specific experiments. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory sizes) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependency details with version numbers. |
| Experiment Setup | Yes | For active set selection, we apply ESC-Streaming on the log-det function defined in Section 5.1 where we use the Gaussian kernel Kij = exp( kxi xjk2 / 2h2 ), and we set the hyperparameters as in [24]: σ = 1, h = 724. For the Dominating Set problem, we set M = 520 MB, = 2 and Q = 0.7|V |. Similarly, for the Vertex Cover problem, we set M = 320 MB, = 2 and Q = 0.8|E|. |