Scalable inference of topic evolution via models for latent geometric structures

Authors: Mikhail Yurochkin, Zhiwei Fan, Aritra Guha, Paraschos Koutris, XuanLong Nguyen

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We study the ability of our models to learn the latent temporal dynamics and discover new topics that change over time. Next we show that our models scale well by utilizing the temporal and group structure inherent in the data. We also study hyperparameter choices. We analyze two datasets: the Early Journal Content (http://www.jstor.org/dfr/about/sample-datasets), and a collection of Wikipedia articles partitioned by categories and in time according to their popularity.
Researcher Affiliation | Collaboration | Mikhail Yurochkin (IBM Research, mikhail.yurochkin@ibm.com); Zhiwei Fan (University of Wisconsin-Madison, zhiwei@cs.wisc.edu); Aritra Guha (University of Michigan, aritra@umich.edu); Paraschos Koutris (University of Wisconsin-Madison, paris@cs.wisc.edu); XuanLong Nguyen (University of Michigan, xuanlong@umich.edu)
Pseudocode | Yes | Algorithm 1: Streaming Dynamic Matching (SDM)
Open Source Code | Yes | Code: https://github.com/moonfolk/SDDM
Open Datasets | Yes | We analyze two datasets: the Early Journal Content (http://www.jstor.org/dfr/about/sample-datasets), and a collection of Wikipedia articles partitioned by categories and in time according to their popularity. Dataset construction details are given in the Supplement.
Dataset Splits | No | The paper mentions setting aside data for 'testing purposes' and evaluating 'perplexity scores on the held-out data', implying a train/test split. However, it does not provide explicit details about training/validation/test splits, such as percentages, sample counts, or a clear three-way partitioning methodology needed for reproduction.
Hardware Specification | No | The paper mentions '20 cores' in Table 1 for the DM and SDDM models but does not provide specific hardware details such as CPU/GPU models, memory, or other system specifications used to run the experiments.
Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies or libraries used in the implementation or experimentation.
Experiment Setup | Yes | In the preceding experiments we set τ0 = 2, τ1 = 1, γ0 = 1 for SDM; τ1 = 2, γ0 = 1 for DM; τ0 = 4, τ1 = 2, γ0 = 2 for SDDM.
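The hyperparameter values quoted in the Experiment Setup row are the only setup specifics the paper reports, so a reproduction would need to map them onto the released code at https://github.com/moonfolk/SDDM. The Python sketch below only organizes those reported values per model; the `run_model` function and its signature are hypothetical placeholders, not the actual API of the repository.

```python
# Hyperparameter settings quoted from the paper's experiment setup.
# tau0, tau1, gamma0 follow the paper's notation (τ0, τ1, γ0).
# NOTE: run_model() below is a hypothetical placeholder, not the actual
# entry point of https://github.com/moonfolk/SDDM.

HYPERPARAMS = {
    "SDM":  {"tau0": 2, "tau1": 1, "gamma0": 1},
    "DM":   {"tau1": 2, "gamma0": 1},   # the paper reports no tau0 for DM
    "SDDM": {"tau0": 4, "tau1": 2, "gamma0": 2},
}

def run_model(name, corpus, **params):
    """Hypothetical dispatch: fit the named model with its reported settings."""
    print(f"Running {name} with {params}")
    # ... model fitting on the preprocessed corpus would go here ...

if __name__ == "__main__":
    corpus = None  # placeholder for the preprocessed document collection
    for name, params in HYPERPARAMS.items():
        run_model(name, corpus, **params)
```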