Scalable Sampling for Nonsymmetric Determinantal Point Processes
Authors: Insu Han, Mike Gartrell, Jennifer Gillenwater, Elvis Dohmatob, Amin Karbasi
ICLR 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In our experiments we compare the speed of all of these samplers for a variety of real-world tasks. In Table 2, we observe that the predictive performance of our ONDPP models generally matches or sometimes exceeds the baseline. |
| Researcher Affiliation | Collaboration | Insu Han (Yale University, insu.han@yale.edu); Mike Gartrell (Criteo AI Lab, m.gartrell@criteo.com); Jennifer Gillenwater (Google Research, jengi@google.com); Elvis Dohmatob (Facebook AI Research, dohmatob@fb.com); Amin Karbasi (Yale University, amin.karbasi@yale.edu) |
| Pseudocode | Yes | Algorithm 1 Cholesky-based NDPP sampling (Poulson, 2019, Algorithm 1), Algorithm 2 Rejection NDPP sampling (Tree-based sampling), Algorithm 3 Tree-based DPP sampling (Gillenwater et al., 2019), Algorithm 4 Youla decomposition of low-rank skew-symmetric matrix |
| Open Source Code | Yes | All of the code implementing our constrained learning and sampling algorithms is publicly available. The proofs for our theoretical contributions are available in Appendix E. For our experiments, all dataset processing steps, experimental procedures, and hyperparameter settings are described in Appendices A, B, and C, respectively. (and footnote: https://github.com/insuhan/nonsymmetric-dpp-sampling) |
| Open Datasets | Yes | UK Retail: This dataset (Chen et al., 2012) contains baskets representing transactions from an online retail company that sells all-occasion gifts. Recipe: This dataset (Majumder et al., 2019) contains recipes and food reviews from Food.com (formerly Genius Kitchen). Instacart: This dataset (Instacart, 2017) contains baskets purchased by Instacart users. Million Song: This dataset (McFee et al., 2012) contains playlists ("baskets") of songs from Echo Nest users. Book: This dataset (Wan & McAuley, 2018) contains reviews from the Goodreads book review website, including a variety of attributes describing the items. (and associated footnotes with URLs for Recipe, Instacart, Million Song, Book datasets) |
| Dataset Splits | Yes | We use 300 randomly-selected baskets as a held-out validation set, for tracking convergence during training and for tuning hyperparameters. Another 2000 random baskets are used for testing, and the rest are used for training. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory, or cloud instance types) used for running the experiments. |
| Software Dependencies | No | "We use PyTorch with Adam (Kingma & Ba, 2015) for optimization." and "We use PyTorch’s linalg.solve to avoid the expense of explicitly computing the $(B^\top B)^{-1}$ inverse." No specific version numbers are provided for PyTorch or other libraries. |
| Experiment Setup | Yes | We perform a grid search using a held-out validation set to select the best-performing hyperparameters for each model and dataset. The hyperparameter settings used for each model and dataset are described below. For all of the above model configurations and datasets, we use a batch size of 800 during training. |
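
The Dataset Splits row describes a simple random partition: 300 baskets for validation, another 2000 for testing, and the remainder for training. A minimal sketch of such a split is given here, using only the Python standard library. The function name `split_baskets`, the `seed` parameter, and the toy data are assumptions made for illustration; the source does not state a random seed or show the authors' actual splitting code.

```python
import random


def split_baskets(baskets, n_val=300, n_test=2000, seed=0):
    """Randomly partition baskets into (train, val, test).

    n_val and n_test follow the sizes quoted in the table; seed is a
    hypothetical parameter, since the source does not report one.
    """
    if len(baskets) <= n_val + n_test:
        raise ValueError("not enough baskets to leave a non-empty training set")
    rng = random.Random(seed)
    shuffled = list(baskets)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    val = shuffled[:n_val]
    test = shuffled[n_val:n_val + n_test]
    train = shuffled[n_val + n_test:]
    return train, val, test


# Synthetic stand-in data: 5000 toy baskets of item ids (not a real dataset).
toy_baskets = [[i, i + 1] for i in range(5000)]
train, val, test = split_baskets(toy_baskets)
print(len(train), len(val), len(test))  # 2700 300 2000
```

Fixing the seed makes the partition repeatable across runs, which matters when the validation set is reused for both convergence tracking and hyperparameter tuning.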