Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Batch greedy maximization of non-submodular functions: Guarantees and applications to experimental design

Authors: Jayanth Jagalur-Mohan, Youssef Marzouk

JMLR 2021 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate our theoretical findings on synthetic problems and on a real-world climate monitoring example.
Researcher Affiliation | Academia | Jayanth Jagalur-Mohan EMAIL Youssef Marzouk EMAIL Massachusetts Institute of Technology Cambridge, MA 02139 USA
Pseudocode | Yes | Algorithm 1: Standard batch greedy algorithm; Algorithm 2: Distributed batch greedy algorithm; Algorithm 3: Stochastic batch greedy algorithm; Algorithm 4: Greedy algorithm using modular lower bounds; Algorithm 5: Sequential greedy algorithm for minimizing information loss
Open Source Code | No | The paper mentions: "The code for sELM is publicly available, and more details about the E3SM land models can be found in the works by Lu and Ricciuto (2019); Ricciuto et al. (2018)." This refers to a third-party model used in their experiments, not the source code for the methodology presented in this paper.
Open Datasets | No | The paper uses "synthetic problems" and a "real-world climate monitoring example." For the latter, it states: "Drawing realizations of these parameters yields a simulation ensemble with 2000 samples." This indicates the authors generated data from a model (sELM) rather than using a pre-existing, publicly available dataset that is formally linked or cited for their experiments.
Dataset Splits | No | The paper describes generating "1000 random instances of the forward operator G" for the synthetic problems and using a "simulation ensemble with 2000 samples" from a climate model. However, it does not specify any training, validation, or test dataset splits in the conventional machine learning sense.
Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU or CPU models, or cloud resources) used for the experiments.
Software Dependencies | No | The paper mentions the "simplified E3SM land model (sELM)" and states "The code for sELM is publicly available." However, it does not give a version number for sELM, nor does it list any other software libraries or tools with version numbers used in the experimental setup.
Experiment Setup | Yes | In Section 5.1, the paper states: "The dimension of the parameters X is set to n = 20, while cardinality of the candidate set of observations Y is fixed at m = 100." It also specifies "correlation lengths 0.105 and 0.021" for the prior and observation error covariances, and that "We draw 1000 random instances of the forward operator G." For algorithm parameters, it considers seven batch sizes, corresponding to q ∈ {1%, 10%, 20%, 30%, 40%, 50%, 100%}.
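For context on the pseudocode listed above: the variants all build on batch greedy selection, where each round adds the q elements with the largest individual marginal gains of a set function. The sketch below is illustrative only, not the paper's implementation; the `coverage` toy function and all names are hypothetical, chosen to mimic a sensor-placement flavor of experimental design.

```python
def batch_greedy(f, ground_set, k, q):
    """Standard batch greedy sketch: each round, add the q remaining
    elements with the largest individual marginal gains f(S + {e}) - f(S),
    until the selected set S reaches cardinality k."""
    S = set()
    while len(S) < k:
        remaining = [e for e in ground_set if e not in S]
        base = f(S)
        # Rank remaining elements by marginal gain relative to current S.
        ranked = sorted(remaining, key=lambda e: f(S | {e}) - base, reverse=True)
        # Add a batch of size q (or fewer, if near the cardinality budget).
        S |= set(ranked[:min(q, k - len(S))])
    return S

# Hypothetical monotone coverage function: element -> set of "targets" covered.
coverage = {0: {1, 2}, 1: {2, 3}, 2: {4, 5, 6}, 3: {1}}
f = lambda S: len(set().union(*(coverage[e] for e in S)))

print(batch_greedy(f, coverage.keys(), k=2, q=1))  # selects {0, 2}
```

With q = 1 this reduces to the classical sequential greedy algorithm; larger q trades per-round optimality for fewer rounds of function evaluations, which is the regime the paper's guarantees address for non-submodular objectives.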