Graph-Sparse LDA: A Topic Model with Structured Sparsity

Authors: Finale Doshi-Velez, Byron Wallace, Ryan Adams

AAAI 2015

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate that our Graph-Sparse LDA model finds interpretable, predictive topics on one toy example and two real-world examples from biomedical domains. In each case we compare our model with the state-of-the-art Bayesian nonparametric topic modeling approach LIDA (Archambeau, Lakshminarayanan, and Bouchard 2011). Figures 3a and 3b show the difference in the held-out test likelihoods for the final 50 samples over 20 independent instantiations of the toy problem.
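A minimal sketch of the evaluation protocol quoted above (averaging held-out log-likelihood over each chain's final 50 samples across 20 independent runs, then comparing models) follows. The arrays are random stand-ins, not the paper's results:

```python
import numpy as np

# Hypothetical per-sample held-out log-likelihoods, shape (chains, final samples);
# the values below are random stand-ins, not figures from the paper.
rng = np.random.default_rng(0)
n_chains, n_final = 20, 50
ll_gslda = rng.normal(-1000.0, 5.0, size=(n_chains, n_final))
ll_lida = rng.normal(-1020.0, 5.0, size=(n_chains, n_final))

# Average over each chain's final samples, then compare per chain.
diff = ll_gslda.mean(axis=1) - ll_lida.mean(axis=1)
print(f"mean held-out log-likelihood difference: {diff.mean():.1f} "
      f"(sd {diff.std(ddof=1):.1f} across {n_chains} chains)")
```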
Researcher Affiliation | Academia | Finale Doshi-Velez, Harvard University, Cambridge, MA 02138, finale@seas.harvard.edu; Byron C. Wallace, University of Texas at Austin, Austin, TX 78701, byron.wallace@utexas.edu; Ryan Adams, Harvard University, Cambridge, MA 02138, rpa@seas.harvard.edu
Pseudocode | No | In the supplementary materials, we derive a blocked-Gibbs sampler for B, B̄, A, Ā, and P (as well as for adding and deleting topics).
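The sampler's derivation lives in the paper's supplementary materials and is not reproduced in this report. As a generic illustration of the blocked-Gibbs idea (jointly resampling groups of variables from their full conditionals), here is a toy sampler for a 3-D Gaussian; the target distribution and the blocking are assumptions for illustration, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.zeros(3)
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.5],
                  [0.3, 0.5, 1.0]])

def conditional(mu, Sigma, a, b, xb):
    """Mean and covariance of x[a] given x[b] = xb under a joint Gaussian."""
    Saa = Sigma[np.ix_(a, a)]
    Sab = Sigma[np.ix_(a, b)]
    Sbb = Sigma[np.ix_(b, b)]
    K = Sab @ np.linalg.inv(Sbb)
    return mu[a] + K @ (xb - mu[b]), Saa - K @ Sab.T

x = np.zeros(3)
samples = []
for _ in range(250):  # same iteration budget the paper reports
    # Block 1: jointly resample (x0, x1) given x2.
    m, C = conditional(mu, Sigma, [0, 1], [2], x[[2]])
    x[[0, 1]] = rng.multivariate_normal(m, C)
    # Block 2: resample x2 given (x0, x1).
    m, C = conditional(mu, Sigma, [2], [0, 1], x[[0, 1]])
    x[2] = rng.normal(m[0], np.sqrt(C[0, 0]))
    samples.append(x.copy())
```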
Open Source Code | No | The paper neither provides access to source code for the described methodology nor explicitly states that code is available.
Open Datasets | Yes | Autism Spectrum Disorder (ASD) is a complex, heterogeneous disease that is often accompanied by many co-occurring conditions such as epilepsy and intellectual disability. We consider a set of 3804 patients with 3626 different diagnoses, where the datum X_nw corresponds to the number of times patient n received diagnosis w during the first 15 years of life. Diagnoses are organized in a tree-structured hierarchy known as ICD-9-CM (Bodenreider 2004). The National Library of Medicine maintains a controlled structured vocabulary of Medical Subject Headings (MeSH) (Lipscomb 2000).
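As a sketch of the data representation described in this quote, the counts X_nw can be assembled into a patient-by-diagnosis matrix, with the ontology entering as a parent map over codes. The records and the tree fragment below are invented for illustration:

```python
from collections import Counter
import numpy as np

# Hypothetical (patient, ICD-9-CM code) diagnosis events.
records = [("patient_0", "345.90"), ("patient_0", "299.00"),
           ("patient_1", "299.00")]

patients = sorted({p for p, _ in records})
codes = sorted({c for _, c in records})
p_idx = {p: i for i, p in enumerate(patients)}
c_idx = {c: j for j, c in enumerate(codes)}

# X[n, w]: number of times patient n received diagnosis w.
X = np.zeros((len(patients), len(codes)), dtype=int)
for (p, c), k in Counter(records).items():
    X[p_idx[p], c_idx[c]] = k

# The tree-structured hierarchy can be carried as a child -> parent map
# over codes; this fragment is a toy stand-in, not the real ICD-9-CM tree.
parent = {"345.90": "345.9", "299.00": "299.0"}
```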
Dataset Splits | No | A random 1% of each dataset was held out to compute predictive log-likelihoods.
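The quote does not say whether whole documents or individual counts were held out, so the sketch below masks random cells of the count matrix as an assumption; the matrix itself is a random stand-in:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.poisson(0.05, size=(3804, 3626))      # stand-in count matrix

mask = rng.random(X.shape) < 0.01             # roughly 1% of cells held out
X_train = np.where(mask, 0, X)                # training counts with held-out cells zeroed
heldout_cells = np.argwhere(mask & (X > 0))   # (patient, diagnosis) cells to score at test time
```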
Hardware Specification | No | The paper does not provide specific hardware details (such as exact GPU/CPU models or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions the models and algorithms used (e.g., LDA, LIDA, Gibbs sampler) but does not provide specific version numbers for any software dependencies or libraries required for replication.
Experiment Setup | Yes | We ran all samplers for 250 iterations. To reduce burn-in, the product AP was initialized using an LDA tensor decomposition (Anandkumar et al. 2012) and then factored into A and P using alternating minimization to find a sparse A that enforced the simplex and ontology constraints.
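A hedged sketch of the factorization step in this quote: alternating projected-gradient updates that keep the rows of A and P on the simplex. The LDA tensor-decomposition initialization, the pressure toward a sparse A, and the ontology constraint are omitted; the shapes, step size, and iteration count are assumptions:

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    rho = np.nonzero(u - css / np.arange(1, len(v) + 1) > 0)[0][-1]
    return np.maximum(v - css[rho] / (rho + 1.0), 0.0)

rng = np.random.default_rng(0)
M = rng.dirichlet(np.ones(30), size=10)   # stand-in for the initialized product A @ P
A = rng.dirichlet(np.ones(5), size=10)    # 10 x 5, rows on the simplex
P = rng.dirichlet(np.ones(30), size=5)    # 5 x 30, rows on the simplex

for _ in range(200):
    A -= 0.01 * (A @ P - M) @ P.T                   # gradient step in A
    A = np.apply_along_axis(project_simplex, 1, A)  # re-project rows of A
    P -= 0.01 * A.T @ (A @ P - M)                   # gradient step in P
    P = np.apply_along_axis(project_simplex, 1, P)  # re-project rows of P
```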