Precision-Recall Balanced Topic Modelling

Authors: Seppo Virtanen, Mark Girolami

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Extensive experiments demonstrate the proposed approach is effective and infers more coherent topics than existing related approaches."
Researcher Affiliation | Academia | Seppo Virtanen, University of Cambridge (sjv35@cam.ac.uk); Mark Girolami, University of Cambridge and The Alan Turing Institute (mag92@cam.ac.uk)
Pseudocode | No | The paper describes the collapsed Gibbs sampling algorithm in prose but does not present it in a pseudocode block or algorithm environment. (A hedged baseline sketch follows the table.)
Open Source Code | No | No explicit statement or link providing open-source code for the described methodology was found.
Open Datasets | Yes | "We show the model performance for three subsets of publicly available data collections, NYTIMES (https://archive.ics.uci.edu/ml/datasets/Bag+of+Words), movie reviews (http://www.cs.cornell.edu/people/pabo/movie-review-data/) and 20newsgroup (http://qwone.com/~jason/20Newsgroups/), as well as for textual product descriptions combined with categorical information that we employ for further evaluations."
Dataset Splits | Yes | "We sample 1/5 of the documents for each data collection to create a test set containing M̂ documents." (A hedged split sketch follows the table.)
Hardware Specification | No | No specific hardware specifications (e.g., GPU/CPU models, memory) used for running the experiments were mentioned.
Software Dependencies | No | The paper mentions using "R-INLA" but does not specify a version number for it or for any other software dependency.
Experiment Setup | Yes | "We initialise the assignments randomly and set α_k = 0.1 and γ = 0.01, corresponding to weakly informative priors, and use 5 × 10^3 sampling steps as burn-in. After the burn-in we collect posterior averages for S = 200 samples. We infer the models for K = 200 topics and for 21 equi-spaced values of λ between 0 and 0.2, noting that λ = 0 corresponds to the standard topic model (LDA)." (A hedged configuration sketch follows the table.)
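
Since the paper presents its sampler only in prose, the following is a minimal sketch of a collapsed Gibbs sweep for the λ = 0 baseline (standard LDA), using the priors quoted above. It is our reconstruction, not the authors' code, and it omits the precision-recall balancing term that is the paper's contribution; the function name collapsed_gibbs_lda and all variable names are ours.

```python
import numpy as np

def collapsed_gibbs_lda(docs, V, K=200, alpha=0.1, gamma=0.01,
                        n_burnin=5_000, n_samples=200, seed=0):
    """Collapsed Gibbs sampler for plain LDA (the paper's lambda = 0 case).

    docs: list of documents, each a list/array of word ids in [0, V).
    Returns posterior-averaged topic-word (phi) and doc-topic (theta) matrices.
    """
    rng = np.random.default_rng(seed)
    M = len(docs)
    ndk = np.zeros((M, K))            # topic counts per document
    nkw = np.zeros((K, V))            # word counts per topic
    nk = np.zeros(K)                  # total token count per topic

    # Random initialisation of the topic assignments, as in the paper.
    z = [rng.integers(K, size=len(doc)) for doc in docs]
    for d, doc in enumerate(docs):
        for w, k in zip(doc, z[d]):
            ndk[d, k] += 1
            nkw[k, w] += 1
            nk[k] += 1

    phi = np.zeros((K, V))
    theta = np.zeros((M, K))
    for it in range(n_burnin + n_samples):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # Remove the token from the counts ...
                ndk[d, k] -= 1; nkw[k, w] -= 1; nk[k] -= 1
                # ... sample from the collapsed conditional p(z_i = k | rest) ...
                p = (ndk[d] + alpha) * (nkw[:, w] + gamma) / (nk + V * gamma)
                k = rng.choice(K, p=p / p.sum())
                # ... and add it back under the new assignment.
                z[d][i] = k
                ndk[d, k] += 1; nkw[k, w] += 1; nk[k] += 1
        if it >= n_burnin:
            # Accumulate posterior averages after burn-in.
            phi += (nkw + gamma) / (nk[:, None] + V * gamma)
            theta += (ndk + alpha) / (ndk.sum(1, keepdims=True) + K * alpha)
    return phi / n_samples, theta / n_samples
```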
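
The dataset-split row quotes the only preprocessing detail the paper gives. A minimal way to reproduce it, assuming a uniform random sample of documents without replacement (the paper does not specify the sampling scheme), is:

```python
import numpy as np

def split_held_out(docs, frac=1 / 5, seed=0):
    """Hold out a fraction of documents (1/5 in the paper) as a test set."""
    rng = np.random.default_rng(seed)
    M = len(docs)
    held = set(rng.choice(M, size=int(round(frac * M)), replace=False).tolist())
    train = [doc for d, doc in enumerate(docs) if d not in held]
    test = [doc for d, doc in enumerate(docs) if d in held]
    return train, test
```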
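
Finally, the quoted experiment setup translates directly into a small configuration sketch. Only the numeric settings come from the paper; the driver loop and the placeholder comment are ours.

```python
import numpy as np

# Settings quoted in the Experiment Setup row.
K = 200                # number of topics
ALPHA = 0.1            # symmetric Dirichlet prior alpha_k on doc-topic weights
GAMMA = 0.01           # Dirichlet prior gamma on topic-word distributions
N_BURNIN = 5_000       # burn-in sampling steps
N_SAMPLES = 200        # posterior samples collected after burn-in (S)

# 21 equi-spaced lambda values; lambda = 0 recovers the standard topic model.
lambdas = np.linspace(0.0, 0.2, 21)

for lam in lambdas:
    # The authors' precision-recall balanced sampler is not public; one fit per
    # lambda would go here. At lam == 0.0 it reduces to plain LDA, e.g. the
    # collapsed_gibbs_lda sketch above (which has no lambda term at all).
    pass
```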