Exponential Family Embeddings
Authors: Maja Rudolph, Francisco Ruiz, Stephan Mandt, David Blei
NeurIPS 2016
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We study our methods on three different types of data: neuroscience data, shopping data, and movie ratings data. Mirroring the success of word embeddings, ef-emb models outperform traditional dimension reduction, such as exponential family principal component analysis (pca) (Collins et al., 2001) and Poisson factorization (Gopalan et al., 2015), and find interpretable features of the data. Section 3: Empirical Study. |
| Researcher Affiliation | Academia | Maja Rudolph Columbia University Francisco J. R. Ruiz Univ. of Cambridge Columbia University Stephan Mandt Columbia University David M. Blei Columbia University |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any concrete access information (e.g., a link or an explicit statement) regarding the availability of source code for the methodology described. |
| Open Datasets | Yes | Data. We analyze the neural activity of a larval zebrafish, recorded at single cell resolution for 3000 time frames (Ahrens et al., 2013). Market basket data. We analyze the IRI dataset (Bronnenberg et al., 2008)... MovieLens data. We also analyze the MovieLens-100K dataset (Harper and Konstan, 2015)... |
| Dataset Splits | Yes | We train each model on a random sample of 90% of the lagged time frames and hold out 5% each for validation and testing. For each K we select the Adagrad constant based on best predictive performance on the validation set. In the MovieLens data we hold out 20% of the ratings and set aside an additional 5% of the non-zero entries from the test set for validation. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., GPU/CPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions 'stochastic gradient descent (sgd) with Adagrad (Duchi et al., 2011)', but does not provide specific version numbers for any software dependencies or libraries. |
| Experiment Setup | Yes | We study context sizes c ∈ {10, 50} and latent dimension K ∈ {10, 100}. We fit the p-emb and the ap-emb models using number of components K ∈ {20, 100}. For each K we select the Adagrad constant based on best predictive performance on the validation set. (The parameters we used are in Table 5.) |
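The splits quoted above (90% train, 5% validation, 5% test over the 3000 lagged time frames) can be reproduced with a simple random partition. The sketch below is a minimal illustration, not the authors' code; the function name `split_indices` and the use of NumPy are assumptions.

```python
import numpy as np

def split_indices(n, train_frac=0.90, val_frac=0.05, seed=0):
    """Randomly partition n examples into train/validation/test index sets,
    mirroring the paper's 90% / 5% / 5% split of lagged time frames.
    (Hypothetical helper; the paper does not release code.)"""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    train = perm[:n_train]
    val = perm[n_train:n_train + n_val]
    test = perm[n_train + n_val:]
    return train, val, test

# 3000 time frames in the zebrafish recording
train, val, test = split_indices(3000)
print(len(train), len(val), len(test))  # 2700 150 150
```

Selecting the Adagrad constant per K, as the paper describes, would then amount to a small grid search scored by predictive performance on the validation indices.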