Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Layer-Assisted Neural Topic Modeling over Document Networks
Authors: Yiming Wang, Ximing Li, Jihong Ouyang
IJCAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results validate that LANTM significantly outperforms the existing models on topic quality, text classification and link prediction. |
| Researcher Affiliation | Academia | Yiming Wang¹˒², Ximing Li¹˒², Jihong Ouyang¹˒²; ¹ College of Computer Science and Technology, Jilin University, China; ² Key Laboratory of Symbolic Computation and Knowledge Engineering of Ministry of Education, Jilin University, China |
| Pseudocode | Yes | Algorithm 1 Training process for LANTM |
| Open Source Code | No | The paper provides GitHub links for several baseline models (e.g., NVDM, ProdLDA, ETM) in footnotes, but it provides neither a link to nor an explicit statement about the availability of source code for the proposed LANTM model. |
| Open Datasets | Yes | In the experiments, we apply the dataset of Cora² consisting of paper abstracts and citations [McCallum et al., 2000], and Reuters³ (R8) without any links. ... ² http://people.cs.umass.edu/mccallum/data/cora-classify.tar.gz ³ https://martin-thoma.com/nlp-reuters/ |
| Dataset Splits | Yes | In both transductive and inductive settings, we conduct 5-fold cross-validation experiments, and report the average scores of Micro-F1 and Macro-F1 in Table 3. ... Table 1: #doc #train #test |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions software such as the Adam optimizer, an SVM classifier (implying scikit-learn), and Palmetto for coherence measurement, but it does not provide version numbers for any of these components. |
| Experiment Setup | Yes | Following [Kipf and Welling, 2016], we set three layers for both channels of MLP and GCN. ... For our LANTM, the combining coefficient ξ is tuned over {0.1, 0.2, ..., 0.9}. For all baseline models, the default parameters are adopted. All methods are trained under same num of epochs and the topic numbers are set as {25, 50} for all datasets. ... For the topic number K, we vary it from the set of {25, 50, 75, 100, 125}. For combining coefficient ξ, we vary it from an increasing set {0.1, 0.2, ..., 0.9}. |
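
The evaluation protocol quoted in the Dataset Splits and Experiment Setup rows (5-fold cross-validation, Micro-F1 and Macro-F1, and the grids for K and ξ) can be illustrated with a minimal standard-library sketch. It is not the authors' code: the function names `k_fold_indices` and `f1_scores`, the contiguous unshuffled folds, and the single-label multi-class assumption are all hypothetical choices made here for illustration. Only the grid values come from the table above.

```python
from itertools import product

# Grid values quoted in the Experiment Setup row.
XI_GRID = [i / 10 for i in range(1, 10)]   # combining coefficient: 0.1 .. 0.9
TOPIC_NUMBERS = [25, 50, 75, 100, 125]     # topic number K


def k_fold_indices(n_docs, k=5):
    """Split range(n_docs) into k (train, test) index pairs.

    Assumption: contiguous, unshuffled folds; the paper does not say
    how its folds were drawn.
    """
    fold_sizes = [n_docs // k + (1 if i < n_docs % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_docs))
        folds.append((train, test))
        start += size
    return folds


def f1_scores(y_true, y_pred):
    """Return (micro_f1, macro_f1) for single-label multi-class predictions."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1   # predicted class gains a false positive
            fn[t] += 1   # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Micro-F1 pools the counts over classes; Macro-F1 averages per-class F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    return micro, macro


# Every (K, xi) configuration in the quoted sensitivity grids.
grid = list(product(TOPIC_NUMBERS, XI_GRID))
```

In use, each `(K, xi)` pair in `grid` would be trained and scored once per fold, and the per-fold Micro-F1 and Macro-F1 values averaged, which matches the "average scores" wording in the Dataset Splits row.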