On Privacy Protection of Latent Dirichlet Allocation Model Training

Authors: Fangyuan Zhao, Xuebin Ren, Shusen Yang, Xinyu Yang

IJCAI 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The experimental results on real-world datasets demonstrate the effectiveness of our proposed algorithms. We conduct experiments on several real-world datasets to demonstrate the effectiveness of our proposed algorithms.
Researcher Affiliation | Academia | 1) School of Computer Science and Technology, Xi'an Jiaotong University, China; 2) National Engineering Laboratory for Big Data Analytics, Xi'an Jiaotong University, China; 3) Ministry of Education Key Lab for Intelligent Networks and Network Security, Xi'an Jiaotong University, China
Pseudocode | Yes | Algorithm 1 Privacy Monitoring for Each Sampling; Algorithm 2 Privacy Monitoring for CGS in LDA
Open Source Code | No | The paper does not include an unambiguous statement or a direct link to the source code for the methodology described in the paper. It only references a full version of the paper on arXiv.
Open Datasets | Yes | The datasets used in our experiment are: KOS [1]: contains 3430 blog entries from the dailykos website. NIPS [2]: contains 1740 research papers from the NIPS conference. Enron [3]: contains 0.5 million email messages from about 150 users. [1] http://archive.ics.uci.edu/ml/ [2] http://nips.djvuzone.org/txt.html [3] www.cs.cmu.edu/~enron
Dataset Splits | No | The paper mentions training and test sets but does not specify a validation set or any splits for one. "We extracted part of these datasets as our training datasets and the rest as the test sets." Table 1 provides "#. training docs" and "#. test docs" but no validation split.
Hardware Specification | No | The paper does not provide any specific details about the hardware used to run the experiments (e.g., specific GPU/CPU models, memory specifications).
Software Dependencies | No | The paper does not provide specific version numbers for any software components or libraries used in the experiments, which are necessary for reproducibility.
Experiment Setup | Yes | In our experiments, for all datasets, the topic number is set as 50 and the maximum iteration number of the CGS process in LDA model training is set as 300, which is sufficient for convergence on all three datasets. The hyperparameters α and β are set as 0.1 and 0.01, respectively.
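
For context on the training procedure whose privacy the paper monitors, the following is a minimal sketch of standard collapsed Gibbs sampling (CGS) for LDA, using the hyperparameter settings reported above (K = 50 topics, α = 0.1, β = 0.01, up to 300 iterations). The toy corpus, the function name cgs_lda, and the omission of any privacy monitoring or noise injection are assumptions made for illustration; this is not the authors' Algorithm 1 or 2.

# Minimal CGS-for-LDA sketch (assumed illustration, not the paper's code).
import numpy as np

def cgs_lda(docs, vocab_size, num_topics=50, alpha=0.1, beta=0.01, iters=300, seed=0):
    """docs: list of documents, each a list of word ids in [0, vocab_size)."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), num_topics))      # document-topic counts
    n_kw = np.zeros((num_topics, vocab_size))     # topic-word counts
    n_k = np.zeros(num_topics)                    # topic totals
    z = []                                        # topic assignment per token
    for d, doc in enumerate(docs):                # random initialization
        z_d = rng.integers(num_topics, size=len(doc))
        z.append(z_d)
        for w, k in zip(doc, z_d):
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]                       # remove current assignment
                n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
                # full conditional: p(z=k) ∝ (n_dk + α)(n_kw + β) / (n_k + Vβ)
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                k = rng.choice(num_topics, p=p / p.sum())
                z[d][i] = k                       # record the new assignment
                n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1
    # posterior point estimates of topic-word and document-topic distributions
    phi = (n_kw + beta) / (n_kw.sum(axis=1, keepdims=True) + vocab_size * beta)
    theta = (n_dk + alpha) / (n_dk.sum(axis=1, keepdims=True) + num_topics * alpha)
    return phi, theta

# Toy usage with a 4-word vocabulary and 2 topics (much smaller than the paper's setup).
docs = [[0, 1, 2, 1], [2, 3, 3, 0]]
phi, theta = cgs_lda(docs, vocab_size=4, num_topics=2, iters=50)

In the paper's setting, each sampling step of this kind is what Algorithms 1 and 2 wrap with privacy monitoring; the sketch above shows only the unprotected baseline training loop.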