Sampling from a k-DPP without looking at all items
Authors: Daniele Calandriello, Michal Derezinski, Michal Valko
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we evaluate our α-DPP sampler on a benchmark introduced by [13] (see Appendix D). The benchmark uses subsets of the infinite MNIST dataset [33] with d = 784 and n varying up to 10^6. All experiments are executed on a 28 core Xeon E5-2680 v4. Results: We begin by reporting results on a smaller subset of data (Figure 1) where even the non-efficient samplers can be run. |
| Researcher Affiliation | Collaboration | Daniele Calandriello, DeepMind Paris, dcalandriello@google.com; Michał Dereziński, University of California, Berkeley, mderezin@berkeley.edu; Michal Valko, DeepMind Paris, valkom@deepmind.com |
| Pseudocode | Yes | Algorithm 1: α-DPP sampler; Algorithm 2: Binary search for initializing the k-DPP(L) sampler |
| Open Source Code | Yes | Our implementation of α-DPP is provided at https://github.com/guilgautier/DPPy/. |
| Open Datasets | Yes | The benchmark uses subsets of the infinite MNIST dataset [33] with d = 784 and n varying up to 10^6. |
| Dataset Splits | No | The paper mentions using the MNIST dataset but does not explicitly provide details on training, validation, or test dataset splits, percentages, or sample counts for reproduction. |
| Hardware Specification | Yes | All experiments are executed on a 28 core Xeon E5-2680 v4. |
| Software Dependencies | No | The paper mentions that the algorithms are implemented in Python as part of DPPy [20] but does not provide specific version numbers for Python or DPPy. |
| Experiment Setup | Yes | We use an rbf similarity with σ = 3d, and set k = 10 to match the number of digit classes in MNIST. |
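To make the setup row concrete, the sketch below builds an RBF likelihood kernel and draws an exact k-DPP sample with the standard eigendecomposition-based sampler. This is a generic reference implementation, not the paper's sublinear α-DPP algorithm, and the function names, the toy data, and the bandwidth value are illustrative assumptions (the paper sets σ = 3d on MNIST with k = 10).

```python
import numpy as np

def elementary_symmetric(lam, k):
    """E[l, n] = e_l(lam[:n]), the l-th elementary symmetric polynomial
    of the first n eigenvalues, filled in by the usual recurrence."""
    N = len(lam)
    E = np.zeros((k + 1, N + 1))
    E[0, :] = 1.0
    for l in range(1, k + 1):
        for n in range(1, N + 1):
            E[l, n] = E[l, n - 1] + lam[n - 1] * E[l - 1, n - 1]
    return E

def sample_k_dpp(L, k, rng):
    """Exact k-DPP sample from a likelihood kernel L (symmetric PSD)."""
    lam, V = np.linalg.eigh(L)
    E = elementary_symmetric(lam, k)
    # Phase 1: pick k eigenvectors with the correct marginal probabilities.
    idx, rem = [], k
    for n in range(len(lam), 0, -1):
        if rem == 0:
            break
        if rng.random() < lam[n - 1] * E[rem - 1, n - 1] / E[rem, n]:
            idx.append(n - 1)
            rem -= 1
    V = V[:, idx]
    # Phase 2: sample items from the projection DPP they span.
    sample = []
    for _ in range(k):
        p = np.sum(V**2, axis=1)
        p /= p.sum()
        i = int(rng.choice(len(p), p=p))
        sample.append(i)
        # Eliminate item i: zero out its row, drop a column, re-orthonormalize.
        j = int(np.argmax(np.abs(V[i, :])))
        V = V - np.outer(V[:, j], V[i, :] / V[i, j])
        V = np.delete(V, j, axis=1)
        if V.shape[1] > 0:
            V, _ = np.linalg.qr(V)
    return sorted(sample)

# Hypothetical toy data in place of MNIST (n = 40 points in 5 dimensions).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 5))
sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
sigma = 3.0  # illustrative bandwidth, not the paper's sigma = 3d
L = np.exp(-sq_dists / (2 * sigma**2))
subset = sample_k_dpp(L, 3, rng)
```

In practice the paper's released code path goes through DPPy rather than a hand-rolled sampler; the point of the sketch is only to show what "rbf similarity kernel + k-DPP sample of size k" means operationally.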