Streaming Submodular Maximization under a k-Set System Constraint

Authors: Ran Haba, Ehsan Kazemi, Moran Feldman, Amin Karbasi

ICML 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We extensively evaluate the empirical performance of our algorithm against the existing work in a series of experiments including finding the maximum independent set in randomly generated graphs, maximizing linear functions over social networks, movie recommendation, Yelp location summarization, and Twitter data summarization.
Researcher Affiliation Collaboration 1Depart. of Mathematics and Computer Science, The Open University of Israel 2Yale Institute for Network Science, Yale University 3Now at Google, Zürich, Switzerland 4Department of Computer Science, University of Haifa, Israel.
Pseudocode Yes Algorithm 1: Non-monotone Data Stream Algorithm; Algorithm 2: Streaming Algorithm for k-set Systems
Open Source Code No The paper does not contain any explicit statements or links indicating that the source code for the methodology described in the paper is publicly available.
Open Datasets Yes For the Erd os Rény graphs we vary in our experiments the probability p that each possible edge is included in the graph (independently from every other edge), and for the Watts-Strogatz graphs we vary the rewiring probability β. The number of nodes is set to n = 2000 in all the graphs, and in the Watts Strogatz model each node is connected to k = 100 nearest neighbors in the ring structure. In the experiment, we use two real-world networks from (Leskovec & Krevl, 2014) as the graph. The dataset for this experiment contains 1793 movies... Yelp Academic Dataset. https://www.kaggle. com/yelp-dataset/yelp-dataset, 2019a. Yelp Dataset. https://www.yelp.com/ dataset, 2019b.
Dataset Splits No The paper describes the datasets and experimental setups but does not explicitly provide details about training, validation, or test splits (e.g., percentages, sample counts, or methodology for creating these splits) needed to reproduce data partitioning.
Hardware Specification No The paper describes the algorithms, experiments, and results, but it does not specify any hardware details such as CPU/GPU models, memory, or computational resources used for running the experiments.
Software Dependencies No The paper does not provide any specific software dependencies or versions (e.g., programming languages, libraries, frameworks, or solvers with version numbers) that would be needed to reproduce the experiments.
Experiment Setup Yes For the Erd os Rény graphs we vary in our experiments the probability p that each possible edge is included in the graph (independently from every other edge), and for the Watts-Strogatz graphs we vary the rewiring probability β. The number of nodes is set to n = 2000 in all the graphs, and in the Watts Strogatz model each node is connected to k = 100 nearest neighbors in the ring structure. The knapsack cost of each edge e = (u, v) E is chosen to be proportional to max(1, du q), where du is the degree of node u in graph G and q = 6, and the costs are normalized so that P e E ce = |V |, where ce represents the knapsack cost of edge e. In the experiment, we use two real-world networks from (Leskovec & Krevl, 2014) as the graph, and vary the knapsack budget between 0 and 1. The user may specify an upper limit m on the number of movies in the set we recommend for them, as well as an upper limit mi on the number of movies from each genre. For simplicity, we use a single value for all mi and refer to this value as the genre limit. In our experiment with this function as the object, we set the genre limit to 20.