Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Revisiting stochastic submodular maximization with cardinality constraint: A bandit perspective
Authors: Pratik Jawanpuria, Bamdev Mishra, Karthik S. Gurumoorthy
TMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results illustrate the efficacy of the proposed approach on applications such as exemplar-based clustering and representative sampling from a target set. The proofs of all theoretical results and additional experimental details are provided in the appendix. In this section, we show the benefit of the proposed BSG and BG algorithms on the exemplar-based clustering and representative sampling applications. |
| Researcher Affiliation | Industry | Pratik Jawanpuria (Microsoft, India); Bamdev Mishra (Microsoft, India); Karthik S. Gurumoorthy (Walmart Global Tech, India) |
| Pseudocode | Yes | Algorithm 1: Proposed bandit greedy (BG) and bandit stochastic greedy (BSG) algorithms; Algorithm 2: Approximate best arm (ABA) algorithm; Algorithm 3: Naive elimination algorithm; Algorithm 4: Aggressive elimination algorithm |
| Open Source Code | No | The BanditPAM authors' C++ code is linked: https://github.com/motiwari/BanditPAM. This repository is for a baseline method (BanditPAM) used in the experiments, not for the authors' own proposed methodology. The paper states: "All the algorithms (except BanditPAM) are implemented in Matlab." No code for the proposed methods is provided. |
| Open Datasets | Yes | MNIST (LeCun et al., 1998) is a handwritten digits dataset... Tiny ImageNet (TIN) is a smaller version of the ImageNet dataset (Russakovsky et al., 2015) with 200 classes and 500 instances per class (Wu et al., 2017). |
| Dataset Splits | Yes | MNIST (LeCun et al., 1998) is a handwritten digits dataset with 28×28-pixel greyscale images of 10 classes for digits {0, 1, ..., 9}. It has two different sets of 60,000 samples (train) and 10,000 samples (test). The source set consists of 5,000 points uniformly sampled from the MNIST test set (of size 10,000). The target set is constructed from the MNIST train set consisting of 60,000 points such that one class has a skewed r% representation and the others have (100−r)/9% representation each. |
| Hardware Specification | Yes | The experiments are run on Intel Xeon CPU (3.6 GHz) with 6 cores and 64 GB RAM. |
| Software Dependencies | No | "All the algorithms (except BanditPAM) are implemented in Matlab." This statement names the software but does not specify a version number for Matlab or for any other libraries used by the authors' proposed algorithms. |
| Experiment Setup | Yes | For BSG-NE, BG-NE, and SCG, we experiment with ν = {5, 10, 15}. ... We set δ = 0.001 for BSG-NE and BG-NE and ϵ = 0.01 for BSG-NE. We experiment with LSG in two settings: ϵ = 0.01 and ϵ = 0.1. For BanditPAM, we show results with δ = {0.01, 0.001, 2k/n}. |