Context Selection for Embedding Models
Authors: Li-Ping Liu, Francisco J. R. Ruiz, Susan Athey, David M. Blei
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We run a comprehensive experimental study on three datasets, namely, MovieLens for movie recommendations, eBird-PA for bird watching events, and grocery data for shopping behavior. We found that CS-EFE consistently outperforms EFE in terms of held-out predictive performance on the three datasets. |
| Researcher Affiliation | Academia | Li-Ping Liu (Tufts University); Francisco J. R. Ruiz (Columbia University, University of Cambridge); Susan Athey (Stanford University); David M. Blei (Columbia University) |
| Pseudocode | No | No pseudocode or algorithm block was found in the paper. |
| Open Source Code | Yes | The code is available in the GitHub repo: https://github.com/blei-lab/context-selection-embedding |
| Open Datasets | Yes | MovieLens: We consider the MovieLens-100K dataset (Harper and Konstan, 2015)... eBird-PA: The eBird data (Munson et al., 2015; Sullivan et al., 2009) contains information about a set of bird observation events. |
| Dataset Splits | Yes | We set aside 9% of the data for validation and 10% for test. (MovieLens) We split the data into train (67%), test (26%), and validation (7%) sets. (eBird-PA) We split the data into training (86%), test (5%), and validation (9%) sets. (Market-Basket) (See the split sketch after this table.) |
| Hardware Specification | Yes | We also acknowledge the support of NVIDIA Corporation with the donation of two GPUs used for this research. |
| Software Dependencies | No | We use stochastic gradient descent to maximize the objective function, adaptively setting the stepsize with Adam (Kingma and Ba, 2015). (No specific software versions provided for reproducibility.) |
| Experiment Setup | Yes | We explore different values for the dimensionality K of the embedding vectors. ... We use negative sampling (Rudolph et al., 2016) with a ratio of 1/10 of positive (non-zero) versus negative samples. We use stochastic gradient descent to maximize the objective function, adaptively setting the stepsize with Adam (Kingma and Ba, 2015)... We consider unit-variance ℓ2-regularization, and the weight of the regularization term is fixed to 1.0. ... In the context selection for exponential family embeddings (CS-EFE) model, we set the number of hidden units to 30 and 15 for each of the hidden layers, and we consider 40 bins to form the histogram. (See the configuration sketch after this table.) |
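
The split percentages in the "Dataset Splits" row are reported per dataset, but no splitting script is quoted. A minimal sketch of one way to realize those ratios, assuming a simple random partition over rows (the dataset size, seed, and function names are illustrative assumptions, not taken from the released code):

```python
# Minimal sketch (not the authors' code): reproducing the reported split ratios
# with a simple random partition. Fractions come from the table above; the
# dataset size and seed below are illustrative.
import numpy as np

SPLITS = {
    "MovieLens":     {"train": 0.81, "val": 0.09, "test": 0.10},
    "eBird-PA":      {"train": 0.67, "val": 0.07, "test": 0.26},
    "Market-Basket": {"train": 0.86, "val": 0.09, "test": 0.05},
}

def split_indices(n_rows, fractions, seed=0):
    """Randomly assign row indices to train/val/test by the given fractions."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_rows)
    n_train = int(fractions["train"] * n_rows)
    n_val = int(fractions["val"] * n_rows)
    return {
        "train": perm[:n_train],
        "val": perm[n_train:n_train + n_val],
        "test": perm[n_train + n_val:],
    }

idx = split_indices(100_000, SPLITS["MovieLens"])
print({name: len(rows) for name, rows in idx.items()})
```

Any structure the authors may have imposed on the splits (e.g., grouping by user or observation event) is not reflected in this simple random partition.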
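
The "Experiment Setup" and "Software Dependencies" rows quote the main hyperparameters in prose. Below is a minimal sketch that collects them into one configuration and illustrates negative sampling at the stated ratio; the embedding dimension K=50, the toy data, and the reading of "1/10 of positive versus negative samples" as ten negatives per positive are assumptions, and the released repository (https://github.com/blei-lab/context-selection-embedding) remains the authoritative implementation:

```python
# Rough sketch (an assumption, not the released code) of the quoted hyperparameters
# and of a toy negative-sampling step.
import numpy as np

config = {
    "embedding_dim_K": 50,        # the paper explores several K values; 50 is an assumed example
    "neg_per_pos": 10,            # reading "1/10 positive vs. negative" as 10 negatives per positive
    "l2_weight": 1.0,             # weight of the regularization term, fixed to 1.0
    "hidden_units": (30, 15),     # hidden layers of the context-selection network
    "num_histogram_bins": 40,     # bins used to form the histogram
    "optimizer": "adam",          # step size set adaptively with Adam (Kingma and Ba, 2015)
}

def sample_negatives(counts, neg_per_pos, rng):
    """For each positive (non-zero) entry, draw `neg_per_pos` zero entries at random."""
    rows, cols = np.nonzero(counts)
    zeros = np.argwhere(counts == 0)
    n_neg = neg_per_pos * len(rows)
    picked = zeros[rng.choice(len(zeros), size=min(n_neg, len(zeros)), replace=False)]
    return np.stack([rows, cols], axis=1), picked

# Toy count matrix standing in for user-item or event-species counts.
rng = np.random.default_rng(0)
toy_counts = rng.integers(0, 3, size=(20, 30)) * (rng.random((20, 30)) < 0.1)
pos, neg = sample_negatives(toy_counts, config["neg_per_pos"], rng)
print(len(pos), len(neg))
```

The quoted text names Adam for adaptive step sizes but gives no learning rate, so none is fixed here; the optimizer entry is recorded only as part of the configuration.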