Semi-supervised Max-margin Topic Model with Manifold Posterior Regularization

Authors: Wenbo Hu, Jun Zhu, Hang Su, Jingwei Zhuo, Bo Zhang

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments demonstrate that such tight coupling brings significant benefits in quantitative and qualitative performance. We now present the empirical results of our semi-supervised topic model.
Researcher Affiliation | Academia | Wenbo Hu, Jun Zhu, Hang Su, Jingwei Zhuo, Bo Zhang; Tsinghua National Laboratory for Information Science and Technology (TNList), State Key Lab for Intelligent Technology and Systems, Center for Brain-Inspired Computing Research (CBICR), Department of Computer Science and Technology, Tsinghua University, Beijing, 100084, China. {hwb13@mails., dcszj@, suhangss@, zhuojw10@mails., dcszb@}tsinghua.edu.cn
Pseudocode | No | The paper provides detailed mathematical formulations and descriptions of the proposed method, but it does not include a clearly labeled 'Pseudocode' or 'Algorithm' block.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository.
Open Datasets | Yes | We consider a binary document set which consists of two subgroups of the 20Newsgroups data (http://qwone.com/~jason/20Newsgroups), alt.atheism and talk.religion.misc. This sub-dataset consists of 856 training documents and 569 testing documents.
Dataset Splits | Yes | The parameter c2 is the regularization parameter for the manifold regularization which is chosen from {0.1, 0.01, 0.001} via 5-fold cross validation. This sub-dataset consists of 856 training documents and 569 testing documents.
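The c2 selection quoted above can be sketched as a plain 5-fold grid search. The fold-splitting helper and the `score` callback below are assumptions for illustration; the paper specifies only the grid {0.1, 0.01, 0.001} and the fold count, not the selection code:

```python
def kfold_indices(n, k=5):
    """Split indices 0..n-1 into k contiguous (train, val) folds;
    the last fold absorbs the remainder when k does not divide n."""
    fold = n // k
    for i in range(k):
        lo = i * fold
        hi = n if i == k - 1 else lo + fold
        val = list(range(lo, hi))
        train = list(range(0, lo)) + list(range(hi, n))
        yield train, val

def select_c2(n_docs, score, grid=(0.1, 0.01, 0.001), k=5):
    """Pick the c2 with the best mean validation score; `score(c2, train,
    val)` is a placeholder for fitting the model on one fold and
    evaluating it on the held-out documents."""
    best_c2, best = None, float("-inf")
    for c2 in grid:
        mean = sum(score(c2, tr, va) for tr, va in kfold_indices(n_docs, k)) / k
        if mean > best:
            best_c2, best = c2, mean
    return best_c2
```

With the 856 training documents quoted above, four folds hold 171 validation documents and the last holds 172.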
Hardware Specification | No | The paper discusses training time and efficiency of the models but does not provide specific details about the hardware (e.g., GPU/CPU models, memory) used for running the experiments.
Software Dependencies | No | The paper mentions 'AdaGrad stepsize [Duchi et al., 2011]' and 'SGLD steps' as part of the optimization process but does not list specific software libraries or their version numbers (e.g., Python, PyTorch, TensorFlow versions).
Experiment Setup | Yes | For LDA-based models, we set α = 1, β = 1 and topic number K = 20. For MedLDA-based models, we set ℓ = 164 and c1 = 1. The parameter c2 is the regularization parameter for the manifold regularization which is chosen from {0.1, 0.01, 0.001} via 5-fold cross validation. The expectation of the topic assignments Z is calculated with 5 samples, and for graph construction we set the nearest-neighbor number as 10 for the 20Newsgroups dataset and 5 for the Yahoo news dataset. ... For the stochastic gradient MCMC, the stepsizes for the classifier weights η follow the AdaGrad rule [Duchi et al., 2011] and the stepsizes for the topic-word parameter Φ are set as 10(1 + t/100)^(-0.6) at iteration t.
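The two stepsize rules quoted in the setup can be sketched as follows. This is an illustrative reading, not the authors' code: the decaying schedule for Φ assumes the standard negative exponent of SGLD stepsizes, and the AdaGrad hyperparameters (base learning rate, epsilon) are placeholders the paper does not give:

```python
import math

def phi_stepsize(t, a=10.0, b=100.0, gamma=0.6):
    """Decaying stepsize for the topic-word parameters Phi:
    eps_t = a * (1 + t/b)^(-gamma), i.e. 10(1 + t/100)^(-0.6)."""
    return a * (1.0 + t / b) ** (-gamma)

class AdaGrad:
    """Illustrative AdaGrad step for the classifier weights eta: each
    coordinate is scaled by the inverse root of its accumulated
    squared gradients (Duchi et al., 2011)."""
    def __init__(self, dim, lr=0.1, eps=1e-8):
        self.lr, self.eps = lr, eps
        self.hist = [0.0] * dim  # accumulated squared gradients

    def step(self, grad):
        updates = []
        for i, g in enumerate(grad):
            self.hist[i] += g * g
            updates.append(self.lr * g / (math.sqrt(self.hist[i]) + self.eps))
        return updates  # subtract these from eta for a descent step
```

The Φ schedule starts at 10 and decays monotonically, satisfying the usual diminishing-stepsize requirement of SGLD, while AdaGrad adapts per-coordinate stepsizes to the observed gradient magnitudes.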