Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et alK. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026..

How Do You Want Your Greedy: Simultaneous or Repeated?

Authors: Moran Feldman, Christopher Harshaw, Amin Karbasi

JMLR 2023 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Finally, we test these algorithms on real datasets.
Researcher Affiliation	Academia	Moran Feldman EMAIL Department of Computer Science University of Haifa Haifa, Israel Christopher Harshaw EMAIL Simons Institute Univeristy of California, Berkeley Berkeley, CA, USA Amin Karbasi EMAIL Departments of Electrical Engeering, Computer Science, Statsitics & Data Science Yale University New Haven, CT, USA
Pseudocode	Yes	Algorithm 1: Simultaneous Greedys (N, f, I, ℓ)
Open Source Code	Yes	We also present Submodular Greedy.jl, a Julia package which implements these algorithms. Our final main contribution in this paper is Submodular Greedy.jl, an open source Julia package which implements the simultaneous and repeated greedy algorithms described here, along with their nearly linear time and knapsack variants. We have written the package so that it is easy to use out-of-the-box , requiring little to no knowledge of the algorithmic variants such as marginal gain thresholding and density ratio thresholding. The package is available at this URL5 https://github.com/crharshaw/Submodular Greedy.jl and the installation requires only one line of code using the Julia package manager.
Open Datasets	Yes	In our experiments, we use data from the Movie Lens 20M Dataset, which features 20 million ratings of 27,000 movies by 138,000 users. For each movie, we construct a corresponding feature vector vi by using a low-rank matrix completion technique on the user reviews, as proposed by Lindgren et al. (2015).
Dataset Splits	No	The paper describes how the dataset is used to create problem instances based on genre limits and rating budgets but does not specify any explicit training/test/validation splits for the data itself.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for running the experiments.
Software Dependencies	No	Our final main contribution in this paper is Submodular Greedy.jl, an open source Julia package which implements the simultaneous and repeated greedy algorithms described here, along with their nearly linear time and knapsack variants.
Experiment Setup	Yes	We define the genre-limiting constraint set (I, N), where S I if \|{e S : g Ge}\| dg for each genre g G. We choose the genre fraction limits of most genres to reflect the total fraction of movies belonging to the genre, i.e., qg = \|{e N : g Ge}\|/n. The exceptions are Crime, Drama and Thriller which have slightly higher genre fraction limits and Animation, Children, Romance, and Horror which have slightly lower genre fraction limits. where λ [0, 1] is a user-defined penalty term. For both Density Search SGS and Density Search RG, we set the number of solutions to ℓ= 2 and we use values δ = ε {0.1, 0.01}.