Exact sampling of determinantal point processes with sublinear time preprocessing

Authors: Michal Derezinski, Daniele Calandriello, Michal Valko

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental In this section, we experimentally evaluate the performance of DPP-VFX and compare it to exact sampling [HKP+06] and MCMC-based approaches [AGR16]. In particular, we are only interested in evaluating computational performance, i.e., how DPP-VFX and baselines scale with the size n of the matrix L. We use the infinite MNIST digits dataset (i.e., MNIST8M [LCB07]) as input data, where n varies up to 106 and d = 784.
Researcher Affiliation Collaboration Michał Derezi nski Department of Statistics University of California, Berkeley mderezin@berkeley.edu Daniele Calandriello LCSL Istituto Italiano di Tecnologia, Italy daniele.calandriello@iit.it Michal Valko Deep Mind Paris valkom@deepmind.com
Pseudocode Yes Algorithm 1 DPP-VFX sampling S DPP(L) Input: L 2 Rn n, its Nyström approximation b L, q > 0 1: Compute li = (L b L) + b L(I + b L) 1 ii Pr(i 2 S) 2: Initialize s = P i li, z = tr %b L(I + b L) 1& ij 3: repeat 4: sample t Poisson(q es/q) 5: sample σ1, ..., σt i.i.d. ( l1 s ) 6: sample Acc Bernoulli ez det(I+e Lσ) ets/q det(I+b L) 7: until Acc = true 8: sample e S DPP 9: return S = {σi : i2 e S}
Open Source Code Yes Our implementation of DPP-VFX is provided at https://github.com/guilgautier/DPPy/.
Open Datasets Yes We use the infinite MNIST digits dataset (i.e., MNIST8M [LCB07]) as input data, where n varies up to 106 and d = 784.
Dataset Splits No The paper mentions using subsets of MNIST8M for experiments but does not provide specific train/validation/test split percentages, sample counts, or explicit methodology for data partitioning.
Hardware Specification Yes All experiments are carried out on a 24-core CPU and fully take advantage of potential parallelization. We gratefully acknowledge the support of NVIDIA Corporation for the donation of the Titan Xp GPUs and the Tesla k40 GPU used for this research.
Software Dependencies No The paper states that "All algorithms, including DPP-VFX, are implemented in python as part of the DPPy library [GBV19]", but it does not provide specific version numbers for Python or the DPPy library.
Experiment Setup Yes For the Nyström approximation we set m = 10deff(1) 10k. Following the strategy of Section 4, for each algorithm we control the size of the output set by rescaling the input matrix L by a constant ? such that E[|S ?|] = 10.