Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Federated Latent Dirichlet Allocation: A Local Differential Privacy Based Framework

Authors: Yansheng Wang, Yongxin Tong, Dingyuan Shi | Pages: 6283-6290

AAAI 2020 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments on three open datasets verified the effectiveness of our solution.
Researcher Affiliation	Academia	Yansheng Wang, Yongxin Tong, Dingyuan Shi SKLSDE Lab, BDBC, School of Computer Science and Engineering and IRI, Beihang University, China EMAIL
Pseudocode	Yes	Algorithm. 1 shows the details of local sampling. Algorithm. 2 shows the details of global integration. Algorithm. 3 shows our RRP mechanism.
Open Source Code	No	The paper does not provide any statement or link indicating that open-source code for the methodology is available.
Open Datasets	Yes	We use three open datasets: Reviews, Emails and Sentiments (Maas et al. 2011). The dataset Emails contains 33,716 spam/non-spam emails with M = 150 and |V| = 3309. The dataset Sentiments has 50,000 highly polar movie reviews with positive/negative sentiments, with M = 150 and |V| = 22574.
Dataset Splits	Yes	We split training data and test data by 4 : 1 for logistic regression and train both data for 100 iterations with the same solver.
Hardware Specification	No	The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for the experiments. It only mentions general terms like 'computer clusters'.
Software Dependencies	No	The paper does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, specific library versions).
Experiment Setup	Yes	Parameter settings. We randomly sample 1K, 5K and 3K instances respectively from Reviews, Emails and Sentiments for evaluation. The default ϵ is 7.5 for all datasets and the default K is 20 for Reviews, 30 for Emails and 50 for Sentiments.
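
The Dataset Splits and Experiment Setup rows can be read together as a small configuration plus a sampling step. The following is a minimal sketch of that reading, not the authors' code: the names `SETTINGS` and `sample_and_split`, the fixed seed, and the choice to apply the 4 : 1 split to the sampled instances are all assumptions made for illustration. Only the numbers (sample sizes, ϵ = 7.5, K, the 4 : 1 ratio, and the 33,716 Emails instances) come from the table.

```python
import random

# Defaults quoted in the table above. The dictionary layout and key names
# are hypothetical; only the values are taken from the paper excerpt.
SETTINGS = {
    "Reviews": {"sample_size": 1000, "epsilon": 7.5, "num_topics": 20},
    "Emails": {"sample_size": 5000, "epsilon": 7.5, "num_topics": 30},
    "Sentiments": {"sample_size": 3000, "epsilon": 7.5, "num_topics": 50},
}


def sample_and_split(instances, sample_size, train_parts=4, test_parts=1, seed=0):
    """Randomly sample `sample_size` instances, then split them train:test.

    Assumption: the 4 : 1 split is applied to the sampled subset. The
    excerpt does not say whether the split happens before or after sampling.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    sampled = rng.sample(list(instances), sample_size)
    cut = sample_size * train_parts // (train_parts + test_parts)
    return sampled[:cut], sampled[cut:]


# Stand-in for the Emails dataset: integer IDs for its 33,716 documents.
train, test = sample_and_split(range(33716), SETTINGS["Emails"]["sample_size"])
print(len(train), len(test))  # 4000 1000
```

Recording the seed alongside these settings is one way to make the random sampling step repeatable; the paper excerpt does not state whether a seed was fixed.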