Structured Embedding Models for Grouped Data

Authors: Maja Rudolph, Francisco Ruiz, Susan Athey, David Blei

NeurIPS 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our quantitative results show that sharing the context vectors provides better results, and that amortization and hierarchical structure give further improvements. Data. We apply the S-EFE on three datasets: arXiv papers, U.S. Senate speeches, and purchases on supermarket grocery shopping data.
Researcher Affiliation | Academia | Maja Rudolph (Columbia University, maja@cs.columbia.edu); Francisco Ruiz (University of Cambridge and Columbia University); Susan Athey (Stanford University); David Blei (Columbia University)
Pseudocode | No | The paper describes the mathematical formulations and steps of the S-EFE model but does not present them in a structured pseudocode block or algorithm listing.
Open Source Code | Yes | Code is available at https://github.com/mariru/structured_embeddings
Open Datasets | No | arXiv papers: This dataset contains the abstracts of papers published on the arXiv under 19 different tags between April 2007 and June 2015. Senate speeches: This dataset contains U.S. Senate speeches from 1994 to mid 2009. Grocery shopping data: This dataset contains the purchases of 3,206 customers over a period of 97 weeks. The paper mentions these datasets but does not provide specific links, DOIs, or citations with author/year for public access to the versions used.
Dataset Splits | Yes | We split the abstracts into training, validation, and test sets, with proportions of 80%, 10%, and 10%, respectively. We split the speeches into training (80%), validation (10%), and testing (10%). We split the shopping data into training, test, and validation sets, with proportions of 90%, 5%, and 5%, respectively.
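The reported 80/10/10 document split can be sketched as follows. This is a minimal illustration of random proportional splitting, not the authors' actual preprocessing code (which lives in their repository); the function name and seed are assumptions.

```python
import random

def split_indices(n, train_frac=0.8, val_frac=0.1, seed=0):
    """Shuffle n example indices and cut them into train/validation/test."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # fixed seed for a reproducible split
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    # Remaining indices (1 - train_frac - val_frac of the data) form the test set.
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# Example: 1000 documents split 800 / 100 / 100.
train, val, test = split_indices(1000)
```

For the shopping data the same helper would be called with `train_frac=0.9, val_frac=0.05` to obtain the 90/5/5 proportions.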
Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models.
Software Dependencies | No | We implement the hierarchical and amortized S-EFE models in TensorFlow (Abadi et al., 2015), which allows us to leverage automatic differentiation. The paper mentions TensorFlow but does not provide a specific version number.
Experiment Setup | Yes | For text we set the dimension of the embeddings to K = 100 and the number of hidden units to H = 25, and we experiment with two context sizes, 2 and 8. In the shopping data, we use K = 50 and H = 20, and we randomly truncate the context of baskets larger than 20 items down to 20. For both methods, we use 20 negative samples. For text, we use a minibatch size of N/10000, where N is the size of the corpus, and run 5 passes over the data; for the shopping data we use N/100 and run 50 passes. We use TensorFlow's default learning rate for Adam (Kingma and Ba, 2015). The weights are drawn from a uniform distribution whose bound depends on K and H (Glorot and Bengio, 2010). Finally, for each method we choose the regularization variance from the set {100, 10, 1, 0.1}, also based on validation error.
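The weight initialization cited above can be sketched with the standard Glorot-uniform rule, U(-b, b) with b = sqrt(6 / (fan_in + fan_out)) (Glorot and Bengio, 2010). This is an illustration of that standard initializer applied to the reported text hyperparameters, not the authors' exact code.

```python
import math
import random

# Hyperparameters reported for the text experiments in the paper.
K = 100   # embedding dimension
H = 25    # hidden units of the amortization network

def glorot_uniform(fan_in, fan_out, seed=0):
    """Draw a fan_in x fan_out matrix with entries ~ U(-b, b),
    where b = sqrt(6 / (fan_in + fan_out)) (Glorot-uniform rule)."""
    rng = random.Random(seed)
    bound = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-bound, bound) for _ in range(fan_out)]
            for _ in range(fan_in)]

# e.g. a K x H weight matrix, as in a first layer mapping embeddings to hidden units.
W = glorot_uniform(K, H)
```

For the shopping data the same rule would apply with K = 50 and H = 20.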