Sampling from a k-DPP without looking at all items

Authors: Daniele Calandriello, Michal Derezinski, Michal Valko

NeurIPS 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "In this section, we evaluate our α-DPP sampler on a benchmark introduced by [13] (see Appendix D). The benchmark uses subsets of the infinite MNIST dataset [33] with d = 784 and n varying up to 10^6. All experiments are executed on a 28-core Xeon E5-2680 v4. Results: We begin by reporting results on a smaller subset of the data (Figure 1), where even the non-efficient samplers can be run."
Researcher Affiliation | Collaboration | Daniele Calandriello (DeepMind Paris, dcalandriello@google.com); Michał Dereziński (University of California, Berkeley, mderezin@berkeley.edu); Michal Valko (DeepMind Paris, valkom@deepmind.com)
Pseudocode | Yes | Algorithm 1: α-DPP sampler; Algorithm 2: Binary search for initializing the k-DPP(L) sampler
Open Source Code | Yes | "Our implementation of α-DPP is provided at https://github.com/guilgautier/DPPy/."
Open Datasets | Yes | "The benchmark uses subsets of the infinite MNIST dataset [33] with d = 784 and n varying up to 10^6."
Dataset Splits | No | The paper mentions using the MNIST dataset but does not explicitly provide details on training, validation, or test dataset splits, percentages, or sample counts for reproduction.
Hardware Specification | Yes | "All experiments are executed on a 28-core Xeon E5-2680 v4."
Software Dependencies | No | The paper notes that the algorithms are implemented in Python as part of DPPy [20], but does not provide specific version numbers for Python or DPPy.
Experiment Setup | Yes | "We use an RBF similarity with σ = 3d, and set k = 10 to match the number of digit classes in MNIST."
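To make the quoted setup concrete (RBF similarity with σ = 3d, k = 10 on d = 784 features), here is a minimal sketch that builds such a kernel on toy data and draws one k-DPP sample. This is not the paper's α-DPP sampler: it uses the classic exact eigendecomposition-based k-DPP algorithm (Kulesza & Taskar) as a stand-in, assumes the common bandwidth convention L[i, j] = exp(−‖x_i − x_j‖² / (2σ²)), and substitutes random Gaussian vectors for MNIST features; sizes are shrunk so it runs in seconds.

```python
import numpy as np

def rbf_kernel(X, sigma):
    """RBF similarity L[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def sample_k_dpp(L, k, rng):
    """Exact k-DPP sample via eigendecomposition (Kulesza & Taskar)."""
    lam, V = np.linalg.eigh(L)
    n = lam.shape[0]
    # Elementary symmetric polynomials: E[j, i] = e_j(lam[0], ..., lam[i-1]).
    E = np.zeros((k + 1, n + 1))
    E[0, :] = 1.0
    for j in range(1, k + 1):
        for i in range(1, n + 1):
            E[j, i] = E[j, i - 1] + lam[i - 1] * E[j - 1, i - 1]
    # Phase 1: select k eigenvectors; eigenvector i-1 is kept with probability
    # lam[i-1] * e_{j-1}(first i-1 eigenvalues) / e_j(first i eigenvalues).
    keep, j = [], k
    for i in range(n, 0, -1):
        if j == 0:
            break
        if rng.random() < lam[i - 1] * E[j - 1, i - 1] / E[j, i]:
            keep.append(i - 1)
            j -= 1
    V = V[:, keep]
    # Phase 2: sample one item per kept eigenvector, projecting it out each time.
    items = []
    while V.shape[1] > 0:
        p = np.sum(V ** 2, axis=1)
        i = rng.choice(n, p=p / p.sum())
        items.append(int(i))
        # Zero out coordinate i in every column, drop one column, re-orthonormalize.
        col = np.argmax(np.abs(V[i, :]))
        v = V[:, col] / V[i, col]
        V = np.delete(V - np.outer(v, V[i, :]), col, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(items)

# Toy stand-in for an MNIST subset: the paper uses d = 784, sigma = 3*d, k = 10.
rng = np.random.default_rng(0)
n, d, k = 300, 784, 10
X = rng.standard_normal((n, d))
L = rbf_kernel(X, sigma=3 * d)
sample = sample_k_dpp(L, k, rng)
print(sample)  # k distinct item indices in [0, n)
```

Note the eigendecomposition makes this baseline cost O(n³) and touch every item, which is precisely the dependence on n that the paper's sampler is designed to avoid.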