reproducibilityindex.ai

Dirichlet Simplex Nest and Geometric Inference

Authors: Mikhail Yurochkin, Aritra Guha, Yuekai Sun, Xuanlong Nguyen

ICML 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	The effectiveness of our model and the learning algorithm is demonstrated by simulations and by analyses of text and financial data. Section 5 presents an exhaustive comparative study on simulated and real data.
Researcher Affiliation	Collaboration	(1) IBM Research, Cambridge; (2) MIT-IBM Watson AI Lab; (3) Department of Statistics, University of Michigan.
Pseudocode	Yes	Algorithm 1 Voronoi Latent Admixture (VLAD)
Open Source Code	Yes	Code: https://github.com/moonfolk/VLAD
Open Datasets	No	The paper mentions 'a collection of news articles from the New York Times' and 'stock market analysis' but does not provide specific links, DOIs, repositories, or formal citations for public access to these datasets.
Dataset Splits	No	The paper mentions '100k training documents with 25k left out for perplexity evaluation' but does not explicitly state a validation split, its percentage, or exact sample counts for a dedicated validation set.
Hardware Specification	No	The paper refers to training times and the use of Stan for HMC but does not provide specific hardware details such as GPU/CPU models, memory, or cloud instance types used for experiments.
Software Dependencies	No	The paper mentions 'Stan (Carpenter et al., 2017)' and 'R' but does not provide specific version numbers for software dependencies or libraries used in the experimental setup.
Experiment Setup	Yes	The hyperparameter settings are D = 500, K = 10, α = 2 (for LDA vocabulary size D = 2000). To study the role of geometry of the DSN we rescale extreme points towards their mean by uniform random factors c_k ~ Unif(c_min, 1) for k = 1, ..., K and vary c_min in Fig. 3.
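
The rescaling described in the Experiment Setup row can be illustrated with a minimal sketch. It is not taken from the VLAD repository. It assumes that "rescaling towards the mean" means replacing each extreme point beta_k with center + c_k * (beta_k - center), where center is the average of the K extreme points. The function name `rescale_extreme_points`, the placeholder extreme points (normalized exponential draws), and the value c_min = 0.5 are all hypothetical choices made for illustration.

```python
import random


def rescale_extreme_points(points, c_min, rng):
    """Shrink each point toward the mean of all points.

    Assumed form: new_k = center + c_k * (point_k - center),
    with c_k drawn independently from Unif(c_min, 1).
    """
    k_count = len(points)
    dim = len(points[0])
    # Coordinate-wise mean of the K extreme points.
    center = [sum(p[d] for p in points) / k_count for d in range(dim)]
    # One shrinkage factor per extreme point.
    factors = [rng.uniform(c_min, 1.0) for _ in range(k_count)]
    scaled = [
        [center[d] + c * (p[d] - center[d]) for d in range(dim)]
        for p, c in zip(points, factors)
    ]
    return scaled, factors


def random_simplex_point(dim, rng):
    """Placeholder extreme point: normalized exponential draws (an assumption)."""
    draws = [rng.expovariate(1.0) for _ in range(dim)]
    total = sum(draws)
    return [x / total for x in draws]


rng = random.Random(0)
K, D = 10, 500  # K and D as listed in the table
points = [random_simplex_point(D, rng) for _ in range(K)]
scaled, factors = rescale_extreme_points(points, c_min=0.5, rng=rng)
```

Under this reading, each rescaled point is a convex combination of the original point and the mean, so points on the probability simplex stay on it. Smaller values of c_min allow the extreme points to be pulled closer to the center.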