Dirichlet Simplex Nest and Geometric Inference

Authors: Mikhail Yurochkin, Aritra Guha, Yuekai Sun, Xuanlong Nguyen

ICML 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "The effectiveness of our model and the learning algorithm is demonstrated by simulations and by analyses of text and financial data." Section 5 presents an exhaustive comparative study on simulated and real data.
Researcher Affiliation | Collaboration | ¹IBM Research, Cambridge; ²MIT-IBM Watson AI Lab; ³Department of Statistics, University of Michigan.
Pseudocode | Yes | Algorithm 1: Voronoi Latent Admixture (VLAD)
Open Source Code | Yes | Code: https://github.com/moonfolk/VLAD
Open Datasets | No | The paper mentions "a collection of news articles from the New York Times" and "stock market analysis" but does not provide specific links, DOIs, repositories, or formal citations for public access to these datasets.
Dataset Splits | No | The paper mentions "100k training documents with 25k left out for perplexity evaluation" but does not explicitly state a validation split, its percentage, or exact sample counts for a dedicated validation set.
Hardware Specification | No | The paper reports training times and the use of Stan for HMC but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used in the experiments.
Software Dependencies | No | The paper mentions Stan (Carpenter et al., 2017) and R but does not provide specific version numbers for the software dependencies or libraries used in the experimental setup.
Experiment Setup | Yes | The hyperparameter settings are D = 500, K = 10, α = 2 (vocabulary size D = 2000 for LDA). To study the role of the geometry of the DSN, the extreme points are rescaled towards their mean by uniform random factors c_k ∼ Unif(c_min, 1) for k = 1, …, K, varying c_min (Fig. 3).
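The rescaling described in the Experiment Setup row can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the K extreme points are drawn as Dirichlet samples on the D-dimensional simplex (variable names `betas`, `center`, and `c` are hypothetical), and pulls each point toward the common mean by a factor c_k ∼ Unif(c_min, 1).

```python
import numpy as np

# Hypothetical sketch of the geometry experiment: rescale K extreme points
# toward their mean by random factors c_k ~ Unif(c_min, 1). Smaller c_min
# shrinks the simplex spanned by the extreme points.
rng = np.random.default_rng(0)
D, K, c_min = 500, 10, 0.5                 # D and K match the paper's setup; c_min is illustrative

betas = rng.dirichlet(np.ones(D), size=K)  # K points on the (D-1)-simplex
center = betas.mean(axis=0)                # their mean (also on the simplex)
c = rng.uniform(c_min, 1.0, size=K)        # rescaling factors c_k

# Convex combination (1 - c_k) * center + c_k * beta_k stays on the simplex.
betas_rescaled = center + c[:, None] * (betas - center)
```

Because each rescaled point is a convex combination of simplex points, the result remains nonnegative with rows summing to 1.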