Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Determining the Number of Latent Factors in Statistical Multi-Relational Learning
Authors: Chengchun Shi, Wenbin Lu, Rui Song
JMLR 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Simulations and real data examples show that our proposed information criteria have good finite sample properties. Section 4. Numerical Experiments. In Section 4.2, we introduce our algorithm for computing the maximum likelihood estimators of a logistic RESCAL model. Simulation studies are presented in Section 4.3. In Section 4.4, we apply the proposed information criteria to a real dataset. Tables 1, 2, and 3 report numerical results from these experiments. |
| Researcher Affiliation | Academia | Chengchun Shi EMAIL Wenbin Lu EMAIL Rui Song EMAIL Department of Statistics North Carolina State University Raleigh, NC 27695, USA |
| Pseudocode | Yes | 4.1. Implementation. In this section, we propose an algorithm for computing {â_i^(s)}_i and {R̂_k^(s)}_k. The algorithm is based upon a 3-block alternating direction method of multipliers (ADMM). ... Applying the dual descent method yields the following steps, with l denoting the iteration number: {a_{i,l+1}^(s)}_{i=s+1}^n = argmin L_ρ(...) (11); {R_{k,l+1}^(s)}_{k=1}^K = argmin L_ρ(...) (12); {b_{i,l+1}^(s)}_{i=s+1}^n = argmin L_ρ(...) (13); v_{i,l+1}^(s) = v_{i,l}^(s) + a_{i,l}^(s) − b_{i,l}^(s), i = s+1, ..., n. |
| Open Source Code | No | The paper states: "The ADMM algorithm proposed in Section 4.1 is implemented in R. Some subroutines of the algorithm are written in C with the GNU Scientific Library (GSL, Galassi et al., 2015) to facilitate the computation." However, it does not explicitly state that the authors' implementation code is open-source, nor does it provide a link to a code repository. |
| Open Datasets | Yes | In this section, we apply the proposed information criteria to the Social Evolution dataset (Madan et al., 2012). |
| Dataset Splits | Yes | For any s ∈ {1, ..., 12}, we randomly select 80% of the observations and estimate {â_i^(s)}_i and {R̂_k^(s)}_k by maximizing the observed likelihood function based on these training samples. Then we compute π̂_ijk = exp{(â_i^(s))^T R̂_k^(s) â_j^(s)} / [1 + exp{(â_i^(s))^T R̂_k^(s) â_j^(s)}]. Based on these predicted probabilities, we calculate the area under the precision-recall curve (AUC) on the remaining 20% testing samples. |
| Hardware Specification | No | The paper mentions that the algorithm is implemented in R and uses C subroutines with GSL, but it does not provide any specific details about the hardware (e.g., CPU, GPU, memory) used for running the experiments or simulations. |
| Software Dependencies | Yes | The ADMM algorithm proposed in Section 4.1 is implemented in R. Some subroutines of the algorithm are written in C with the GNU Scientific Library (GSL, Galassi et al., 2015) to facilitate the computation. |
| Experiment Setup | Yes | In our implementation, we set ρ = nK/2. We simulate {Y_ijk}_ijk from the following model: Pr(Y_ijk = 1 &#124; {a_i}_i, {R_k}_k) = exp(a_i^T R_k a_j) / (1 + exp(a_i^T R_k a_j)), with a_1, a_2, ..., a_n i.i.d. N(0, 1) and R_1 = R_2 = ⋯ = R_K = diag(1, 1, ..., 1), a diagonal matrix with s0 ones. We consider six simulation settings. In the first three settings, we fix K = 3 and set n = 100, 150 and 200, respectively. In the last three settings, we increase K to 10, 20, 50, and set n = 50. In each setting, we further consider three scenarios, by setting s0 = 2, 4 and 8. Let s_max = 12. In IC_α, we set α = 0, 0.5 and 1. |
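The simulation design and held-out evaluation quoted above can be sketched end to end. This is a minimal illustration, not the authors' R/C implementation: it simulates Y_ijk from the logistic RESCAL model with s0 latent factors, holds out 20% of the entries, and scores them with the area under the precision-recall curve. Fitting the ADMM maximum-likelihood estimator is out of scope here, so the true link probabilities stand in for the fitted π̂_ijk, and all function names are ours.

```python
import numpy as np

def simulate_rescal(n, K, s0, seed=0):
    """Draw Y_ijk ~ Bernoulli(sigmoid(a_i^T R_k a_j)), following the quoted
    design: a_i i.i.d. standard normal, R_k = diag(1, ..., 1) with s0 ones."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, s0))              # latent factors a_i
    logits = A @ A.T                              # a_i^T R_k a_j with R_k = identity
    p = 1.0 / (1.0 + np.exp(-logits))             # true link probabilities
    Y = (rng.random((K, n, n)) < p).astype(int)   # K relational slices, same p each
    return Y, p

def pr_auc(y_true, score):
    """Area under the precision-recall curve, computed as average precision."""
    order = np.argsort(-score)                    # rank entries by score, descending
    y = y_true[order]
    precision = np.cumsum(y) / np.arange(1, y.size + 1)
    return float(precision[y == 1].mean())        # mean precision at each positive

# Simulate one setting (n = 50, K = 3, s0 = 4) and hold out 20% of entries.
Y, p = simulate_rescal(n=50, K=3, s0=4)
y_all = Y.reshape(-1)
score_all = np.broadcast_to(p, Y.shape).reshape(-1)
rng = np.random.default_rng(1)
test_idx = rng.permutation(y_all.size)[: y_all.size // 5]
# True probabilities stand in for the fitted pi-hat_ijk in this sketch.
auc = pr_auc(y_all[test_idx], score_all[test_idx])
```

Average precision is used here because it equals the area under the precision-recall curve computed at every ranked threshold, avoiding a dependency on an external metrics library.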