Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Hierarchical Compound Poisson Factorization
Authors: Mehmet Basbug, Barbara Engelhardt
ICML 2016 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We compare HCPF with HPF on nine discrete and three continuous data sets and conclude that HCPF captures the relationship between sparsity and response better than HPF. |
| Researcher Affiliation | Academia | Mehmet E. Basbug EMAIL Princeton University, 35 Olden St., Princeton, NJ 08540 USA Barbara E. Engelhardt EMAIL Princeton University, 35 Olden St., Princeton, NJ 08540 USA |
| Pseudocode | Yes | Algorithm 1 SVI for HCPF |
| Open Source Code | No | The paper does not provide a link to its source code or explicitly state that the code is publicly available. |
| Open Datasets | Yes | The rating data sets include amazon fine food ratings (McAuley & Leskovec, 2013), movielens (Harper & Konstan, 2015), netflix (Bell & Koren, 2007) and yelp... social media activity data sets (wordpress and tencent) (Niu et al., 2012)... biochemistry data set (merck) (Ma et al., 2015)... echonest (Bertin-Mahieux et al., 2011)... genomics data set (geuvadis) (Lappalainen et al., 2013). |
| Dataset Splits | Yes | We held out 20% and 1% of the non-missing entries for testing (Y_NM^test) and validation, respectively. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, or memory). |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In HCPF, we fix K = 160, ξ = 0.7 and τ = 10,000 after an empirical study on smaller data sets. To set hyperparameters θ and κ, we use the maximum likelihood estimates of the element distribution parameters on the non-missing entries. ... We then used E[n_ui] to set the factorization hyperparameters η, ζ, ρ, ϱ, ω, ϖ. To create heavy tails and uninformative gamma priors, we set ϖ = ϱ = 0.1 and ω = ρ = 0.01. |
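
The Dataset Splits row quotes a 20% test and 1% validation hold-out of the non-missing entries. A minimal sketch of such a split is shown below. It assumes a uniform random shuffle with a fixed seed, which the quoted text does not specify. The function name `split_non_missing`, its parameters, and the toy 50 x 20 matrix are hypothetical and not taken from the paper.

```python
import random


def split_non_missing(entries, test_frac=0.20, val_frac=0.01, seed=0):
    """Partition observed (non-missing) entries into train/validation/test.

    Assumption: the hold-out is a uniform random sample; the quoted text
    gives only the fractions, not the sampling scheme or a seed.
    """
    entries = list(entries)
    rng = random.Random(seed)
    rng.shuffle(entries)  # shuffles the local copy only

    n_test = round(test_frac * len(entries))
    n_val = round(val_frac * len(entries))

    test = entries[:n_test]
    val = entries[n_test:n_test + n_val]
    train = entries[n_test + n_val:]
    return train, val, test


# Toy example: (row, column) coordinates of the observed entries of a
# hypothetical fully observed 50 x 20 matrix.
observed = [(u, i) for u in range(50) for i in range(20)]
train, val, test = split_non_missing(observed)
print(len(train), len(val), len(test))
```

Splitting coordinates rather than values keeps the sketch independent of how the sparse matrix itself is stored.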