Stochastic Submodular Maximization: The Case of Coverage Functions

Authors: Mohammad Karimi, Mario Lucic, Hamed Hassani, Andreas Krause

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the practical utility of the proposed framework in an extensive experimental evaluation. We show for the first time that continuous optimization is a highly practical, scalable avenue for maximizing submodular set functions. We demonstrate the practical utility of the proposed framework and compare it to standard baselines. We compare the performance of the algorithms in terms of their wall-clock running time and the obtained utility.
Researcher Affiliation | Academia | Mohammad Reza Karimi, Department of Computer Science, ETH Zurich (mkarimi@ethz.ch); Mario Lucic, Department of Computer Science, ETH Zurich (lucic@inf.ethz.ch); Hamed Hassani, Department of Electrical and Systems Engineering, University of Pennsylvania (hassani@seas.upenn.edu); Andreas Krause, Department of Computer Science, ETH Zurich (krausea@ethz.ch)
Pseudocode | Yes | Algorithm 1: Stochastic Submodular Maximization via concave relaxation
  Require: matroid M with base polytope P, step sizes η_t, maximum number of iterations T
  1: x(0) ← starting point in P
  2: for t = 0 to T - 1 do
  3:   choose g_t at random from a distribution such that E[g_t | x(0), ..., x(t)] ∈ ∂F̄(x(t))
  4:   x(t+1/2) ← x(t) + η_t g_t
  5:   x(t+1) ← Project_P(x(t+1/2))
  6: end for
  7: x̄_T ← (1/T) Σ_{t=1}^{T} x(t)
  8: S ← RANDOMIZED-PIPAGE-ROUND(x̄_T)
  9: return S such that S ∈ M and E[f(S)] ≥ (1 - 1/e) f(OPT) - ε(T)
Open Source Code | No | The paper does not provide an explicit statement about releasing source code or a link to a code repository for the methodology described.
Open Datasets | Yes | Influence Maximization on the Epinions network. The network consists of 75 879 nodes and 508 837 directed edges. We consider the subgraph induced by the top 10 000 nodes with the largest out-degree and use the independent cascade model [20]. Facility Location for Blog Selection: we use the data set used in [14], consisting of 45 193 blogs and 16 551 cascades. Exemplar-based Clustering on CIFAR-10: the data set contains 60 000 color images with resolution 32 × 32. We use a single batch of 10 000 images and compare our algorithms to variants of GREEDY over the full data set.
Dataset Splits | No | The paper does not provide specific train/validation/test dataset splits. It mentions using 'a single batch of 10 000 images' and comparing 'over the full data set' for CIFAR, and an 'empirical average of s samples' for Influence Maximization, but no explicit splits.
Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU types, memory amounts) used for running the experiments.
Software Dependencies | No | The paper mentions using techniques like SGD, ADAGRAD, and ADAM but does not list specific software libraries or their version numbers used for implementation.
Experiment Setup | Yes | We set this probability p to be 0.02, and chose the number of seeds k = 50. We set k = 100. We use the Euclidean norm as the distance function and set k = 50. For STOCHASTIC-GREEDY, we vary the parameter ε in order to explore the running time/utility tradeoff. We note that the number of samples suggested by Proposition 3 is overly conservative, and instead we make a practical choice of s = 10^3 samples.
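The projected stochastic gradient loop of Algorithm 1 can be sketched in Python. This is a minimal illustration under several assumptions, not the paper's implementation: the objective is a toy weighted-coverage relaxation F(x) = Σ_u w_u (1 - Π_{i covers u} (1 - x_i)), the matroid is a plain cardinality constraint (so the feasible polytope is {x ∈ [0,1]^n : Σ_i x_i = k}), the projection is computed by bisection, and the paper's randomized pipage rounding is replaced by a crude top-k rounding. All function names, the step size, and the toy instance are hypothetical.

```python
import random

def project_capped_simplex(y, k, iters=60):
    """Euclidean projection onto {x in [0,1]^n : sum(x) = k}.

    The projection has the form x_i = clip(y_i - tau, 0, 1); we find the
    shift tau by bisection, since sum_i clip(y_i - tau, 0, 1) is
    non-increasing in tau.
    """
    lo, hi = min(y) - 1.0, max(y)
    for _ in range(iters):
        tau = (lo + hi) / 2.0
        s = sum(min(1.0, max(0.0, v - tau)) for v in y)
        if s > k:
            lo = tau
        else:
            hi = tau
    tau = (lo + hi) / 2.0
    return [min(1.0, max(0.0, v - tau)) for v in y]

def stochastic_grad(x, cover, weights, u):
    """Gradient contribution of a single sampled universe element u for
    F(x) = sum_u w_u (1 - prod_{i covers u} (1 - x_i))."""
    g = [0.0] * len(x)
    for i in cover[u]:                 # sets (indices) that cover element u
        p = 1.0
        for j in cover[u]:
            if j != i:
                p *= (1.0 - x[j])
        g[i] = weights[u] * p
    return g

def stochastic_submodular_max(cover, weights, n, k, T=500, eta=0.5, seed=0):
    """Projected stochastic gradient ascent on the coverage relaxation,
    followed by a simple top-k rounding (pipage rounding in the paper)."""
    rng = random.Random(seed)
    m = len(weights)
    x = [k / n] * n                    # feasible starting point
    for _ in range(T):
        u = rng.randrange(m)           # sample one universe element
        g = stochastic_grad(x, cover, weights, u)
        # m * g is an unbiased estimate of the full gradient
        x = [xi + eta * m * gi for xi, gi in zip(x, g)]
        x = project_capped_simplex(x, k)
    order = sorted(range(n), key=lambda i: -x[i])
    return set(order[:k])
```

Sampling one universe element u uniformly and scaling its gradient contribution by m keeps the estimate unbiased, which is all the convergence guarantee for projected SGD on the relaxation requires; in the paper's experiments this role is played by the empirical average of s = 10^3 samples.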