A Tree-Based Method for Fast Repeated Sampling of Determinantal Point Processes

Authors: Jennifer Gillenwater, Alex Kulesza, Zelda Mariet, Sergei Vassilvtiskii

ICML 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable Result LLM Response
Research Type Experimental We now present empirical results demonstrating the performance of our sampling algorithms on both synthetic and real-world data sets.
Researcher Affiliation Collaboration Jennifer Gillenwater 1 Alex Kulesza 1 Zelda Mariet 2 3 Sergei Vassilvitskii 1 1Google Research NYC 2Massachussetts Institute of Technology 3Research performed during an internship at Google.
Pseudocode Yes Algorithm 1 Dual DPP sampling (Kulesza & Taskar, 2012, Algorithm 3) ... Algorithm 2 Tree Construction ... Algorithm 3 Tree-Based Sampling ... Algorithm 4 Personalized Tree-Based Sampling
Open Source Code No The paper does not provide any concrete access information (e.g., specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described.
Open Datasets Yes we build a realistic movie recommender system based on the Movie Lens dataset (Harper & Konstan, 2016)
Dataset Splits No The paper mentions "held out test data" but does not provide specific details on the training, validation, and test splits (e.g., percentages or counts) needed for reproduction.
Hardware Specification No The paper mentions general statements like "A modern machine can easily handle millions of items" but does not provide specific hardware details (e.g., exact GPU/CPU models, memory amounts) used for running its experiments.
Software Dependencies No The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers) needed to replicate the experiment.
Experiment Setup Yes We use the resulting movie factor matrix as B (N = 26744, D = 30) to define the DPP kernel... using a threshold of ϵ = 0.4... we use the following simple splitting heuristic: we use the items i, j that maximize bi bj to initialize the left and right subtrees; all other items are then greedily added to the left or right tree in order to minimize their (infinity-norm) distance to the initial item.