On the Affinity, Rationality, and Diversity of Hierarchical Topic Modeling

Authors: Xiaobao Wu, Fengjun Pan, Thong Nguyen, Yichao Feng, Chaoqun Liu, Cong-Duy Nguyen, Anh Tuan Luu

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Experiments on benchmark datasets demonstrate that our method surpasses state-of-the-art baselines, effectively improving the affinity, rationality, and diversity of hierarchical topic modeling with better performance on downstream tasks."
Researcher Affiliation | Collaboration | (1) Nanyang Technological University, Singapore; (2) National University of Singapore, Singapore; (3) DAMO Academy, Alibaba Group, Singapore
Pseudocode | No | The paper describes its methods using mathematical formulas and textual explanations, but does not include structured pseudocode or algorithm blocks.
Open Source Code | Yes | "Our code is available at https://github.com/bobxwu/TraCo."
Open Datasets | Yes | "Datasets: We experiment with the following benchmark datasets: (i) NeurIPS contains the publications at the NeurIPS conference from 1987 to 2017. (ii) ACL (Bird et al. 2008) is a paper collection from the ACL Anthology from 1970 to 2015. (iii) NYT contains news articles of the New York Times with 12 categories. (iv) 20NG (Lang 1995) includes news articles with 20 labels." (A loading sketch for 20NG follows the table.)
Dataset Splits | No | The paper mentions benchmark datasets and evaluation results but does not provide train/validation/test splits (as percentages or document counts), nor does it reference a standard splitting methodology.
Hardware Specification | No | The paper does not specify the hardware used for its experiments (e.g., exact GPU/CPU models or memory).
Software Dependencies | No | The paper states "See more implementation details in the Appendix," but the provided text does not list software dependencies with version numbers.
Experiment Setup | No | The "Experiment Setup" section outlines datasets and baselines but does not provide specific hyperparameter values (e.g., learning rate, batch size, number of epochs) or detailed training configurations needed for reproducibility.
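
As context for the Open Datasets row, below is a minimal sketch of obtaining the 20NG corpus via scikit-learn's fetch_20newsgroups. This is an illustrative assumption, not the paper's pipeline: the paper does not describe how it loads or preprocesses its corpora, and the other three datasets (NeurIPS, ACL, NYT) have no comparable single-call loader.

```python
# Illustrative only: one common way to obtain the 20NG benchmark.
# The paper does not state its loading or preprocessing procedure.
from sklearn.datasets import fetch_20newsgroups

# Fetch the full corpus, stripping headers/footers/quotes as is typical
# in topic-modeling preprocessing (an assumption, not the paper's setup).
corpus = fetch_20newsgroups(subset="all", remove=("headers", "footers", "quotes"))

print(f"{len(corpus.data)} documents, {len(corpus.target_names)} labels")
# -> 18846 documents, 20 labels
```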