Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Federated Latent Dirichlet Allocation: A Local Differential Privacy Based Framework
Authors: Yansheng Wang, Yongxin Tong, Dingyuan Shi | Pages: 6283-6290
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on three open datasets verified the effectiveness of our solution. |
| Researcher Affiliation | Academia | Yansheng Wang, Yongxin Tong, Dingyuan Shi SKLSDE Lab, BDBC, School of Computer Science and Engineering and IRI, Beihang University, China EMAIL |
| Pseudocode | Yes | Algorithm. 1 shows the details of local sampling. Algorithm. 2 shows the details of global integration. Algorithm. 3 shows our RRP mechanism. |
| Open Source Code | No | The paper does not provide any statement or link indicating that open-source code for the methodology is available. |
| Open Datasets | Yes | We use three open datasets: Reviews, Emails and Sentiments (Maas et al. 2011). The dataset Emails contains 33,716 spam/non-spam emails with M = 150 and |V| = 3309. The dataset Sentiments has 50,000 highly polar movie reviews with positive/negative sentiments, with M = 150 and |V| = 22574. |
| Dataset Splits | Yes | We split training data and test data by 4 : 1 for logistic regression and train both data for 100 iterations with the same solver. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, memory) used for the experiments. It only mentions general terms like 'computer clusters'. |
| Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x, specific library versions). |
| Experiment Setup | Yes | Parameter settings. We randomly sample 1K, 5K and 3K instances respectively from Reviews, Emails and Sentiments for evaluation. The default ϵ is 7.5 for all datasets and the default K is 20 for Reviews, 30 for Emails and 50 for Sentiments. |