Expectation-Maximization for Learning Determinantal Point Processes
Authors: Jennifer A Gillenwater, Alex Kulesza, Emily B. Fox, Ben Taskar
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We test our method on a real-world product recommendation task, and achieve relative gains of up to 16.5% in test log-likelihood compared to the naive approach of maximizing likelihood by projected gradient ascent on the entries of the kernel matrix. |
| Researcher Affiliation | Academia | Jennifer Gillenwater (Computer and Information Science, University of Pennsylvania, jengi@cis.upenn.edu); Alex Kulesza (Computer Science and Engineering, University of Michigan, kulesza@umich.edu); Emily Fox (Statistics, University of Washington, ebfox@stat.washington.edu); Ben Taskar (Computer Science and Engineering, University of Washington, taskar@cs.washington.edu) |
| Pseudocode | Yes | Algorithm 1 K-Ascent (KA) and Algorithm 2 Expectation-Maximization (EM) are presented. |
| Open Source Code | Yes | Code and data for all experiments can be downloaded from https://code.google.com/p/em-for-dpps |
| Open Datasets | No | To test our DPP learning algorithms, we collected a dataset consisting of 29,632 baby registries from Amazon.com, filtering out those listing fewer than 5 or more than 100 products. |
| Dataset Splits | No | The paper states: "We used 70% of the data for training and 30% for testing." It does not specify a validation split. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers. |
| Experiment Setup | Yes | The paper describes experimental setup details such as two initialization types (Wishart distribution and moments-matching), a 70% training and 30% testing data split, and data filtering criteria (e.g., 'filtering out those listing fewer than 5 or more than 100 products', 'filtered down to its top 100 most frequent items'). |
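
The preprocessing quoted in the Experiment Setup row (dropping registries with fewer than 5 or more than 100 products, keeping the 100 most frequent items, and a 70%/30% train/test split) could be sketched roughly as follows. This is an illustrative sketch, not the authors' released code: the function names, the representation of a registry as a list of product IDs, the random seed, and the order in which the steps are applied are all assumptions.

```python
import random
from collections import Counter


def filter_registries(registries, min_items=5, max_items=100):
    """Keep registries listing between min_items and max_items products, inclusive."""
    return [r for r in registries if min_items <= len(r) <= max_items]


def top_k_items(registries, k=100):
    """Restrict every registry to the k most frequent items across all registries."""
    counts = Counter(item for r in registries for item in r)
    keep = {item for item, _ in counts.most_common(k)}
    return [[item for item in r if item in keep] for r in registries]


def train_test_split(registries, train_fraction=0.7, seed=0):
    """Shuffle a copy of the data and split it; the seed is a hypothetical choice."""
    shuffled = list(registries)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


if __name__ == "__main__":
    # Toy data: registries of sizes 3, 5, 50, 100, and 101 products.
    toy = [list(range(n)) for n in (3, 5, 50, 100, 101)]
    kept = filter_registries(toy)
    train, test = train_test_split(kept)
    print(len(kept), len(train), len(test))
```

The paper reports no validation split, so the sketch produces only training and test sets.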