Content-based recommendations with Poisson factorization
Authors: Prem Gopalan, Laurent Charlin, David Blei
NeurIPS 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate empirically that our model outperforms several baselines, including the previous state-of-the-art approach. |
| Researcher Affiliation | Academia | Prem Gopalan, Department of Computer Science, Princeton University, Princeton, NJ 08540 (pgopalan@cs.princeton.edu); Laurent Charlin, Department of Computer Science, Columbia University, New York, NY 10027 (lcharlin@cs.columbia.edu); David M. Blei, Departments of Statistics & Computer Science, Columbia University, New York, NY 10027 (david.blei@columbia.edu) |
| Pseudocode | Yes | Figure 2: The CTPF coordinate ascent algorithm. [A simplified sketch of these coordinate-ascent updates follows this table.] |
| Open Source Code | Yes | Our source code is available from: https://github.com/premgopalan/collabtm |
| Open Datasets | Yes | Data sets. We study the CTPF algorithm of Figure 2 on two data sets. The Mendeley data set [13] of scientific articles is a binary matrix of 80,000 users and 260,000 articles with 5 million observations. Each cell corresponds to the presence or absence of an article in a scientist's online library. The arXiv data set is a matrix of 120,297 users and 825,707 articles, with 43 million observations. Each observation indicates whether or not a user has consulted an article (or its abstract). This data was collected from the access logs of registered users on the http://arXiv.org paper repository. |
| Dataset Splits | Yes | Additionally, we set aside 1% of the training ratings as a validation set (20% for arXiv) and use it to determine convergence. [A helper for this split is sketched after this table.] |
| Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory) used for running the experiments are mentioned. |
| Software Dependencies | No | The paper states the implementation is in C++ but does not provide specific software dependencies with version numbers (e.g., libraries or solvers). |
| Experiment Setup | Yes | Following [9], we fix each Gamma shape and rate hyperparameter at 0.3. We initialize the variational parameters for η_uk and ε_dk to the prior on the corresponding latent variables and add small uniform noise. We set learning rate parameters τ0 = 1024, κ = 0.5 and use a mini-batch size of 1024. We set K = 100 in all of our experiments. [The step-size schedule these parameters imply is sketched after this table.] |
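The Pseudocode row refers to the full CTPF coordinate ascent algorithm of Figure 2, which couples document word counts (through topic intensities) with user ratings. As an illustration of the flavor of those updates only, here is a minimal sketch of coordinate-ascent variational inference for the ratings-only Gamma-Poisson factorization core, with every Gamma shape and rate hyperparameter fixed at 0.3 as in the paper. The function name `poisson_mf_cavi` and the dense-matrix interface are illustrative assumptions, not the authors' C++ implementation, and the document-content side of CTPF is omitted.

```python
import numpy as np
from scipy.special import digamma

def poisson_mf_cavi(R, K=100, a=0.3, b=0.3, n_iter=50, seed=0):
    """Sketch: CAVI for Gamma-Poisson factorization of a small dense
    count matrix R (users x items). The full CTPF of Figure 2 adds a
    document-content side that is omitted here."""
    rng = np.random.default_rng(seed)
    U, D = R.shape

    # Gamma variational parameters (shape, rate), initialized at the
    # prior plus small uniform noise, as described in the paper.
    g_shp = a + 0.01 * rng.uniform(size=(U, K))   # q(eta_uk) shape
    g_rte = b + 0.01 * rng.uniform(size=(U, K))   # q(eta_uk) rate
    l_shp = a + 0.01 * rng.uniform(size=(D, K))   # q(eps_dk) shape
    l_rte = b + 0.01 * rng.uniform(size=(D, K))   # q(eps_dk) rate

    uu, dd = np.nonzero(R)            # only nonzero counts need phi
    r = R[uu, dd].astype(float)

    for _ in range(n_iter):
        # Allocate each observed count across the K factors:
        # phi_udk proportional to exp(E[log eta_uk] + E[log eps_dk]).
        log_phi = (digamma(g_shp[uu]) - np.log(g_rte[uu])
                   + digamma(l_shp[dd]) - np.log(l_rte[dd]))
        phi = np.exp(log_phi - log_phi.max(axis=1, keepdims=True))
        phi /= phi.sum(axis=1, keepdims=True)
        w = r[:, None] * phi          # expected latent counts

        # Closed-form Gamma updates for user preferences eta_uk.
        g_shp = np.full((U, K), a)
        np.add.at(g_shp, uu, w)
        g_rte = b + np.ones((U, 1)) * (l_shp / l_rte).sum(axis=0)

        # Closed-form Gamma updates for item attributes eps_dk.
        l_shp = np.full((D, K), a)
        np.add.at(l_shp, dd, w)
        l_rte = b + np.ones((D, 1)) * (g_shp / g_rte).sum(axis=0)

    return g_shp / g_rte, l_shp / l_rte   # posterior means of eta, eps
```

Under this model, predicted ratings are Poisson with rate E[eta_u]^T E[eps_d], and items can be ranked for a user by that rate.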
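For the Dataset Splits row, holding out a fraction of the training ratings to monitor convergence takes only a few lines. The helper below is a hypothetical sketch (the paper does not publish its splitting code), parameterized so that `frac=0.01` matches the Mendeley setting and `frac=0.20` matches arXiv.

```python
import numpy as np

def hold_out_validation(triplets, frac=0.01, seed=0):
    """Set aside a random fraction of (user, item, count) training
    triplets as a validation set; the remainder stays for training."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(triplets))
    n_val = int(round(frac * len(triplets)))
    return triplets[idx[n_val:]], triplets[idx[:n_val]]

# e.g. train, val = hold_out_validation(triplets, frac=0.20)  # arXiv setting
```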
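Finally, the Experiment Setup row names learning-rate parameters τ0 = 1024 and κ = 0.5 without spelling out the schedule. The standard step size in stochastic variational inference is ρ_t = (τ0 + t)^(−κ); assuming that form (it is the usual one in this line of work, but the formula itself is not quoted in the extract), these settings give a slowly decaying rate starting at 1/32:

```python
def svi_step_size(t, tau0=1024.0, kappa=0.5):
    """Robbins-Monro-style step size rho_t = (tau0 + t) ** (-kappa).
    tau0 (the delay) down-weights early iterations; kappa (the
    forgetting rate) controls how fast old mini-batches are forgotten."""
    return (tau0 + t) ** (-kappa)

print(svi_step_size(0))      # 0.03125  (= 1/32)
print(svi_step_size(1024))   # ~0.0221
```

In standard SVI, each update would rescale a mini-batch's sufficient statistics to the size of the full data set and blend them into the global variational parameters with weight ρ_t.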