Context Selection for Embedding Models

Authors: Liping Liu, Francisco Ruiz, Susan Athey, David Blei

NeurIPS 2017

Reproducibility assessment (variable: result — LLM response):
Research Type: Experimental — "We run a comprehensive experimental study on three datasets, namely, MovieLens for movie recommendations, eBird-PA for bird watching events, and grocery data for shopping behavior. We found that CS-EFE consistently outperforms EFE in terms of held-out predictive performance on the three datasets."
Researcher Affiliation: Academia — Li-Ping Liu (Tufts University), Francisco J. R. Ruiz (Columbia University; University of Cambridge), Susan Athey (Stanford University), David M. Blei (Columbia University).
Pseudocode: No — No pseudocode or algorithm block was found in the paper.
Open Source Code: Yes — The code is available in the GitHub repository: https://github.com/blei-lab/context-selection-embedding
Open Datasets: Yes — MovieLens: "We consider the MovieLens-100K dataset (Harper and Konstan, 2015)..." eBird-PA: "The eBird data (Munson et al., 2015; Sullivan et al., 2009) contains information about a set of bird observation events."
Dataset Splits: Yes — MovieLens: "We set aside 9% of the data for validation and 10% for test." eBird-PA: "We split the data into train (67%), test (26%), and validation (7%) sets." Market-Basket: "We split the data into training (86%), test (5%), and validation (9%) sets."
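The reported splits are simple fraction-based partitions of each dataset. A minimal sketch of how such a partition could be produced is below; the helper name, the fixed seed, and the item count are our own illustrative assumptions, not the authors' code (the paper does not describe the splitting procedure itself).

```python
import numpy as np

def train_test_val_split(n_items, train_frac, test_frac, seed=0):
    # Hypothetical helper: shuffle indices, then cut into three
    # contiguous blocks; whatever remains after train and test
    # becomes the validation set.
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_items)
    n_train = int(train_frac * n_items)
    n_test = int(test_frac * n_items)
    train = idx[:n_train]
    test = idx[n_train:n_train + n_test]
    val = idx[n_train + n_test:]
    return train, test, val

# eBird-PA-style split from the report: 67% train, 26% test, 7% validation.
train, test, val = train_test_val_split(1000, 0.67, 0.26)
```

The same helper reproduces the other reported splits by swapping in the Market-Basket fractions (0.86 train, 0.05 test).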
Hardware Specification: Yes — "We also acknowledge the support of NVIDIA Corporation with the donation of two GPUs used for this research."
Software Dependencies: No — "We use stochastic gradient descent to maximize the objective function, adaptively setting the stepsize with Adam (Kingma and Ba, 2015)." No specific software versions are provided for reproducibility.
Experiment Setup: Yes — "We explore different values for the dimensionality K of the embedding vectors. ... We use negative sampling (Rudolph et al., 2016) with a ratio of 1/10 of positive (non-zero) versus negative samples. We use stochastic gradient descent to maximize the objective function, adaptively setting the stepsize with Adam (Kingma and Ba, 2015). ... We consider unit-variance ℓ2-regularization, and the weight of the regularization term is fixed to 1.0. ... In the context selection for exponential family embeddings (CS-EFE) model, we set the number of hidden units to 30 and 15 for each of the hidden layers, and we consider 40 bins to form the histogram."
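The setup row names two concrete, reusable ingredients: the Adam optimizer (Kingma and Ba, 2015) and an ℓ2-regularization term with weight 1.0. The sketch below shows a single Adam update step applied to the regularizer alone, with the reported hyperparameter values collected at the top; it is a minimal illustration under our own assumptions (variable names, toy loss, default Adam constants), not the authors' training code.

```python
import numpy as np

# Hyperparameter values reported in the report; names are ours.
K = 50            # embedding dimensionality (the paper explores several values)
NEG_RATIO = 10    # negative samples per positive (non-zero) entry (ratio 1/10)
L2_WEIGHT = 1.0   # weight of the unit-variance l2-regularization term

def adam_step(theta, grad, state, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: bias-corrected first/second moment estimates
    give each parameter its own adaptive stepsize."""
    m, v, t = state
    t += 1
    m = b1 * m + (1 - b1) * grad          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment (uncentered var) estimate
    m_hat = m / (1 - b1 ** t)             # bias correction
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, (m, v, t)

# Toy usage: minimize only the regularization term 0.5 * L2_WEIGHT * ||theta||^2,
# whose gradient is L2_WEIGHT * theta; theta is driven toward zero.
theta = np.ones(K)
state = (np.zeros(K), np.zeros(K), 0)
for _ in range(5000):
    grad = L2_WEIGHT * theta
    theta, state = adam_step(theta, grad, state)
```

In a full embedding run, `grad` would instead be the stochastic gradient of the negative-sampled objective over a minibatch, with the regularizer's gradient added in.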