Scalable MCMC Sampling for Nonsymmetric Determinantal Point Processes

Authors: Insu Han, Mike Gartrell, Elvis Dohmatob, Amin Karbasi

ICML 2022

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | With both a theoretical analysis and experiments on real-world datasets, we verify that our scalable approximate sampling algorithms are orders of magnitude faster than existing sampling approaches for k-NDPPs and NDPPs. |
| Researcher Affiliation | Collaboration | Insu Han¹, Mike Gartrell², Elvis Dohmatob³, Amin Karbasi¹. ¹Yale University; ²Criteo AI Lab, Paris, France; ³Facebook AI Lab, Paris, France. Correspondence to: Insu Han <insu.han@yale.edu>, Mike Gartrell <m.gartrell@criteo.com>. |
| Pseudocode | Yes | Algorithm 1: MCMC Sampling for k-NDPP; Algorithm 2: Up Operator via Rejection Sampling; Algorithm 3: Tree-Based k-DPP Sampling; Algorithm 4: MCMC Sampling for NDPP (a minimal sketch of the underlying swap step appears below the table) |
| Open Source Code | Yes | The source code for our NDPP sampling algorithms is publicly available at https://github.com/insuhan/ndpp-mcmc-sampling. |
| Open Datasets | Yes | UK Retail (Chen et al., 2012); Recipe (Majumder et al., 2019); Instacart (Instacart, 2017); Million Song (McFee et al., 2012); Book (Wan & McAuley, 2018) |
| Dataset Splits | Yes | We use the training scheme from Han et al. (2022), where 300 randomly selected baskets are held out as a validation set for tracking convergence during training, another 2000 random subsets are used for testing, and the remaining baskets are used for training. (see the split sketch below the table) |
| Hardware Specification | No | No specific hardware details (e.g., CPU/GPU models, memory, or specific cloud instances) were explicitly provided for running the experiments. |
| Software Dependencies | No | The paper mentions using the Adam optimizer (Kingma & Ba, 2015) but does not specify software versions for libraries or programming languages. |
| Experiment Setup | Yes | We use the Adam optimizer (Kingma & Ba, 2015); we initialize D from N(0, 1), and V and B are initialized from U([0, 1]). We set α = β = 0.01 for all datasets. (see the initialization sketch below the table) |
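
The Pseudocode row lists the paper's four sampling routines. As a point of reference for what Algorithm 1 accelerates, the sketch below shows the classical swap-based MCMC step for k-DPP sampling with a dense, possibly nonsymmetric kernel L. The function name, naive determinant recomputation, and the assumption that all principal minors of L are nonnegative are ours; the paper's actual algorithm replaces this linear-time proposal step with tree-based sampling and rejection sampling.

```python
import numpy as np

def kdpp_mcmc_swap(L, k, num_steps=1000, rng=None):
    """Naive swap-based MCMC for k-DPP sampling (illustrative sketch only).

    Targets Pr(S) proportional to det(L_S) over subsets S of size k. Assumes
    all principal minors of L are nonnegative, as required for a valid (N)DPP.
    The paper's Algorithm 1 avoids the brute-force determinant recomputation
    done here.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = L.shape[0]
    S = list(rng.choice(n, size=k, replace=False))  # random initial subset
    det_S = np.linalg.det(L[np.ix_(S, S)])
    for _ in range(num_steps):
        i = int(rng.integers(k))   # position in S to swap out
        v = int(rng.integers(n))   # candidate item to swap in
        if v in S:
            continue               # self-loop: proposal stays symmetric
        T = S.copy()
        T[i] = v
        det_T = np.linalg.det(L[np.ix_(T, T)])
        # Metropolis acceptance with ratio det(L_T) / det(L_S)
        if det_S <= 0 or rng.random() < min(1.0, det_T / det_S):
            S, det_S = T, det_T
    return S
```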
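The Dataset Splits row quotes the held-out scheme of Han et al. (2022). A minimal sketch of that split, assuming `baskets` is an in-memory list (the function name and seed are ours):

```python
import numpy as np

def split_baskets(baskets, n_val=300, n_test=2000, seed=0):
    """Random split per the quoted scheme: 300 baskets for validation,
    2000 for testing, and the remainder for training. Illustrative only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(baskets))
    val = [baskets[i] for i in idx[:n_val]]
    test = [baskets[i] for i in idx[n_val:n_val + n_test]]
    train = [baskets[i] for i in idx[n_val + n_test:]]
    return train, val, test
```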
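The Experiment Setup row quotes the initialization and optimizer choices. A minimal PyTorch sketch of just that setup, assuming a rank-r NDPP parameterization with factors V and B and parameters D; the shapes, learning rate, and kernel assembly are assumptions, and α = β = 0.01 are weights in the training objective rather than Adam hyperparameters:

```python
import torch

n, r = 10_000, 100  # hypothetical catalog size and rank

# Initialization as quoted: D ~ N(0, 1); V, B ~ U([0, 1])
D = torch.nn.Parameter(torch.randn(r))
V = torch.nn.Parameter(torch.rand(n, r))
B = torch.nn.Parameter(torch.rand(n, r))

# Adam optimizer (Kingma & Ba, 2015); the learning rate is an assumption.
# The regularization weights alpha = beta = 0.01 enter the training
# objective (not shown here), not the optimizer itself.
optimizer = torch.optim.Adam([D, V, B], lr=1e-3)
```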