Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Hierarchical Compound Poisson Factorization

Authors: Mehmet Basbug, Barbara Engelhardt

ICML 2016 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We compare HCPF with HPF on nine discrete and three continuous data sets and conclude that HCPF captures the relationship between sparsity and response better than HPF.
Researcher Affiliation	Academia	Mehmet E. Basbug EMAIL Princeton University, 35 Olden St., Princeton, NJ 08540 USA Barbara E. Engelhardt EMAIL Princeton University, 35 Olden St., Princeton, NJ 08540 USA
Pseudocode	Yes	Algorithm 1 SVI for HCPF
Open Source Code	No	The paper does not provide a link to its source code or explicitly state that the code is publicly available.
Open Datasets	Yes	The rating data sets include amazon fine food ratings (McAuley & Leskovec, 2013), movielens (Harper & Konstan, 2015), netflix (Bell & Koren, 2007) and yelp... social media activity data sets (wordpress and tencent) (Niu et al., 2012)... biochemistry data set (merck) (Ma et al., 2015)... echonest (Bertin-Mahieux et al., 2011)... genomics data set (geuvadis) (Lappalainen et al., 2013).
Dataset Splits	Yes	We held out 20% and 1% of the non-missing entries for testing (Y^test_NM) and validation, respectively.
Hardware Specification	No	The paper does not provide any specific details about the hardware used to run the experiments (e.g., CPU, GPU models, or memory).
Software Dependencies	No	The paper does not provide specific version numbers for any software dependencies or libraries used in the experiments.
Experiment Setup	Yes	In HCPF, we fix K = 160, ξ = 0.7 and τ = 10,000 after an empirical study on smaller data sets. To set hyperparameters θ and κ, we use the maximum likelihood estimates of the element distribution parameters on the non-missing entries. ... We then used E[n_ui] to set the factorization hyperparameters η, ζ, ρ, ϱ, ω, ϖ. To create heavy tails and uninformative gamma priors, we set ϖ = ϱ = 0.1 and ω = ρ = 0.01.
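
The hold-out described in the Dataset Splits row (20% of the non-missing entries for testing, 1% for validation) could be reproduced roughly as in the sketch below. This is an illustrative sketch, not the authors' code: the function name `split_non_missing`, the uniform random sampling, and the fixed seed are all assumptions, since the quoted text states only the two fractions.

```python
import numpy as np


def split_non_missing(n_observed, test_frac=0.20, val_frac=0.01, seed=0):
    """Partition the indices 0..n_observed-1 of the non-missing entries
    into disjoint train / validation / test index arrays.

    Assumption: entries are held out uniformly at random; the source
    only gives the fractions (20% test, 1% validation).
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_observed)

    n_test = int(round(test_frac * n_observed))
    n_val = int(round(val_frac * n_observed))

    test_idx = perm[:n_test]
    val_idx = perm[n_test:n_test + n_val]
    train_idx = perm[n_test + n_val:]
    return train_idx, val_idx, test_idx


# Hypothetical usage: a sparse matrix with 1000 observed (non-missing) entries.
train_idx, val_idx, test_idx = split_non_missing(1000)
```

The returned arrays index into whatever list of observed (row, column, value) triples the data set provides, so the same helper would apply to both the discrete and the continuous data sets.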