Stochastic Submodular Maximization: The Case of Coverage Functions
Authors: Mohammad Karimi, Mario Lucic, Hamed Hassani, Andreas Krause
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the practical utility of the proposed framework in an extensive experimental evaluation and compare it to standard baselines. We show for the first time that continuous optimization is a highly practical, scalable avenue for maximizing submodular set functions. We compare the performance of the algorithms in terms of their wall-clock running time and the obtained utility. |
| Researcher Affiliation | Academia | Mohammad Reza Karimi, Department of Computer Science, ETH Zurich (mkarimi@ethz.ch); Mario Lucic, Department of Computer Science, ETH Zurich (lucic@inf.ethz.ch); Hamed Hassani, Department of Electrical and Systems Engineering, University of Pennsylvania (hassani@seas.upenn.edu); Andreas Krause, Department of Computer Science, ETH Zurich (krausea@ethz.ch) |
| Pseudocode | Yes | Algorithm 1: Stochastic Submodular Maximization via concave relaxation. Require: matroid M with base polytope P, step sizes μ_t, maximum number of iterations T. 1: x(0) ← a starting point in P; 2: for t ← 0 to T−1 do; 3: choose g_t at random from a distribution such that E[g_t \| x(0), …, x(t)] ∈ ∂F̄(x(t)); 4: x(t+1/2) ← x(t) + μ_t g_t; 5: x(t+1) ← Project_P(x(t+1/2)); 6: end for; 7: x̄_T ← (1/T) Σ_{t=1}^{T} x(t); 8: S ← RANDOMIZED-PIPAGE-ROUND(x̄_T); 9: return S, which satisfies S ∈ M and E[f(S)] ≥ (1 − 1/e) f(OPT) − ε(T). A hedged Python sketch of this loop appears below the table. |
| Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | Influence Maximization on the Epinions network. The network consists of 75 879 nodes and 508 837 directed edges. We consider the subgraph induced by the top 10 000 nodes with the largest out-degree and use the independent cascade model [20]. Facility Location for Blog Selection. We use the data set used in [14], consisting of 45 193 blogs and 16 551 cascades. Exemplar-based Clustering on CIFAR-10. The data set contains 60 000 color images with resolution 32 × 32. We use a single batch of 10 000 images and compare our algorithms to variants of GREEDY over the full data set. |
| Dataset Splits | No | The paper does not provide specific train/validation/test dataset splits. It mentions using 'a single batch of 10 000 images' and comparing 'over the full data set' for CIFAR, and 'empirical average of s samples' for Influence Maximization, but no explicit splits. |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory amounts) used for running the experiments. |
| Software Dependencies | No | The paper mentions using techniques like SGD, ADAGRAD, and ADAM but does not list specific software libraries or their version numbers used for implementation. |
| Experiment Setup | Yes | For influence maximization, we set the propagation probability p to 0.02 and chose the number of seeds k = 50. For blog selection, we set k = 100. For exemplar-based clustering, we use the Euclidean norm as the distance function and set k = 50. For STOCHASTIC-GREEDY, we vary the parameter ε in order to explore the running time/utility tradeoff. We note that the number of samples suggested by Proposition 3 is overly conservative, and instead we make a practical choice of s = 10^3 samples (see the influence-sampling sketch below the table). |
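
The pseudocode row above describes projected stochastic subgradient ascent on the concave relaxation F̄(x) = Σ_u w_u min(1, Σ_{i∈Γ(u)} x_i), followed by pipage rounding. Below is a minimal Python sketch of that loop, not the authors' code (none is released): it assumes a weighted coverage objective under a cardinality constraint k (a uniform matroid, whose base polytope is {x ∈ [0,1]^n : Σ_i x_i = k}); the helper names, mini-batch size, and 1/√t step schedule are our choices, and rounding is left as a final comment.

```python
import numpy as np

def project_to_kpolytope(y, k, tol=1e-9):
    """Euclidean projection onto {x in [0,1]^n : sum(x) = k} via
    bisection on the threshold tau in clip(y - tau, 0, 1)."""
    lo, hi = y.min() - 1.0, y.max()
    while hi - lo > tol:
        tau = 0.5 * (lo + hi)
        if np.clip(y - tau, 0.0, 1.0).sum() > k:
            lo = tau
        else:
            hi = tau
    return np.clip(y - 0.5 * (lo + hi), 0.0, 1.0)

def stochastic_subgradient(x, cover_sets, weights, rng, batch=256):
    """Unbiased subgradient of the concave relaxation
    F_bar(x) = sum_u w_u * min(1, sum_{i in Gamma(u)} x_i),
    estimated from a uniform mini-batch of covered items u."""
    n_items = len(cover_sets)
    g = np.zeros_like(x)
    for u in rng.integers(n_items, size=batch):
        members = cover_sets[u]               # indices i with u in Gamma(i)
        if x[members].sum() < 1.0:            # min(1, .) on its linear piece
            g[members] += weights[u]
    return g * (n_items / batch)              # rescale so E[g] is a subgradient

def maximize_coverage(cover_sets, weights, n, k, T=1000, step=0.1, seed=0):
    """Projected stochastic subgradient ascent as in Algorithm 1,
    returning the averaged fractional solution x_bar."""
    rng = np.random.default_rng(seed)
    x = np.full(n, k / n)                     # feasible starting point
    avg = np.zeros(n)
    for t in range(T):
        # For stochastic objectives, cover_sets can be resampled each step.
        g = stochastic_subgradient(x, cover_sets, weights, rng)
        x = project_to_kpolytope(x + step / np.sqrt(t + 1) * g, k)
        avg += x
    return avg / T   # apply RANDOMIZED-PIPAGE-ROUND to obtain a set S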
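To make the "empirical average of s samples" in the setup row concrete, here is a hedged sketch of drawing one live-edge sample of the independent cascade model for the influence-maximization experiment (p = 0.02); the adjacency-list representation `adj` and the function name are assumptions, not the authors' interface. Each sample turns influence maximization into exactly the coverage instance consumed by the sketch above: node v covers node u if u is reachable from v in the live-edge graph.

```python
from collections import deque

import numpy as np

def sample_live_edge_cover(adj, p, rng):
    """One independent-cascade sample: keep each directed edge with
    probability p, then record, for every node u, which candidate
    seeds reach u in the live-edge graph (cover_sets[u])."""
    n = len(adj)
    live = [[v for v in out if rng.random() < p] for out in adj]
    cover_sets = [[] for _ in range(n)]
    for seed in range(n):                 # BFS from every candidate seed
        seen = {seed}
        queue = deque([seed])
        while queue:
            w = queue.popleft()
            for v in live[w]:
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        for u in seen:
            cover_sets[u].append(seed)
    return [np.array(c) for c in cover_sets]
```

Averaging `stochastic_subgradient` over s = 10^3 such samples (with unit weights) reproduces the practical estimator quoted in the setup row, in place of the conservative sample count from Proposition 3.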