Exponential Family Embeddings
Authors: Maja Rudolph, Francisco Ruiz, Stephan Mandt, David Blei
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study our methods on three different types of data: neuroscience data, shopping data, and movie ratings data. Mirroring the success of word embeddings, ef-emb models outperform traditional dimension reduction, such as exponential family principal component analysis (pca) (Collins et al., 2001) and Poisson factorization (Gopalan et al., 2015), and find interpretable features of the data. Section 3: Empirical Study. |
| Researcher Affiliation | Academia | Maja Rudolph Columbia University Francisco J. R. Ruiz Univ. of Cambridge Columbia University Stephan Mandt Columbia University David M. Blei Columbia University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a link or an explicit statement) regarding the availability of source code for the methodology described. |
| Open Datasets | Yes | Data. We analyze the neural activity of a larval zebrafish, recorded at single cell resolution for 3000 time frames (Ahrens et al., 2013). Market basket data. We analyze the IRI dataset (Bronnenberg et al., 2008)... MovieLens data. We also analyze the MovieLens-100K dataset (Harper and Konstan, 2015)... |
| Dataset Splits | Yes | We train each model on a random sample of 90% of the lagged time frames and hold out 5% each for validation and testing. For each K we select the Adagrad constant based on best predictive performance on the validation set. In the MovieLens data we hold out 20% of the ratings and set aside an additional 5% of the non-zero entries from the test set for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'stochastic gradient descent (sgd) with Adagrad (Duchi et al., 2011)', but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We study context sizes c ∈ {10, 50} and latent dimension K ∈ {10, 100}. We fit the p-emb and the ap-emb models using number of components K ∈ {20, 100}. For each K we select the Adagrad constant based on best predictive performance on the validation set. (The parameters we used are in Table 5.) |
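The splits quoted above (90% train, 5% validation, 5% test over the 3000 lagged time frames) can be reproduced with a simple random partition. The sketch below is a minimal illustration, not the authors' code; the function name `split_indices` and the use of NumPy are assumptions.

```python
import numpy as np

def split_indices(n, train_frac=0.90, val_frac=0.05, seed=0):
    """Randomly partition n examples into train/validation/test index sets,
    mirroring the paper's 90% / 5% / 5% split of lagged time frames.
    (Hypothetical helper; the paper does not release code.)"""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]
    return train, val, test

# 3000 time frames in the zebrafish recording
train, val, test = split_indices(3000)
print(len(train), len(val), len(test))  # 2700 150 150
```

Selecting the Adagrad constant per K, as the paper describes, would then amount to a small grid search scored by predictive performance on the validation indices.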