Structured Embedding Models for Grouped Data
Authors: Maja Rudolph, Francisco Ruiz, Susan Athey, David Blei
NeurIPS 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our quantitative results show that sharing the context vectors provides better results, and that amortization and hierarchical structure give further improvements. We apply the S-EFE on three datasets: arXiv papers, U.S. Senate speeches, and purchases from supermarket grocery shopping data. |
| Researcher Affiliation | Academia | Maja Rudolph (Columbia Univ., maja@cs.columbia.edu); Francisco Ruiz (Univ. of Cambridge and Columbia Univ.); Susan Athey (Stanford Univ.); David Blei (Columbia Univ.) |
| Pseudocode | No | The paper describes the mathematical formulations and steps of the S-EFE model but does not present them in a structured pseudocode block or algorithm listing. |
| Open Source Code | Yes | Code is available at https://github.com/mariru/structured_embeddings |
| Open Datasets | No | arXiv papers: This dataset contains the abstracts of papers published on the arXiv under 19 different tags between April 2007 and June 2015. Senate speeches: This dataset contains U.S. Senate speeches from 1994 to mid 2009. Grocery shopping data: This dataset contains the purchases of 3,206 customers. The data covers a period of 97 weeks. The paper mentions these datasets but does not provide specific links, DOIs, or citations with author/year for public access to the versions used. |
| Dataset Splits | Yes | We split the abstracts into training, validation, and test sets, with proportions of 80%, 10%, and 10%, respectively. We split the speeches into training (80%), validation (10%), and testing (10%). We split the data into training, test, and validation sets, with proportions of 90%, 5%, and 5%, respectively. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU or CPU models. |
| Software Dependencies | No | We implement the hierarchical and amortized S-EFE models in TensorFlow (Abadi et al., 2015), which allows us to leverage automatic differentiation. The paper mentions TensorFlow but does not provide a specific version number. |
| Experiment Setup | Yes | For text we set the dimension of the embeddings to K = 100, the number of hidden units to H = 25, and we experiment with two context sizes, 2 and 8. In the shopping data, we use K = 50 and H = 20, and we randomly truncate the context of baskets larger than 20 to reduce their size to 20. For both methods, we use 20 negative samples. For text, we use a minibatch size of N/10000, where N is the size of the corpus, and we run 5 passes over the data; for the shopping data we use N/100 and run 50 passes. We use TensorFlow's default learning rate for Adam (Kingma and Ba, 2015). The weights are drawn from a uniform distribution whose bounds depend on K + H (Glorot and Bengio, 2010). Finally, for each method we choose the regularization variance from the set {100, 10, 1, 0.1}, also based on validation error. |
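The weight initialization reported in the Experiment Setup row can be sketched as follows. Note this is a minimal illustration, not the authors' code: the bound √(6 / (K + H)) is the standard Glorot uniform formula from Glorot and Bengio (2010), and applying it with fan-in K and fan-out H is an assumption about the paper's implementation.

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng=None):
    """Sample a (fan_in, fan_out) weight matrix from the Glorot/Xavier
    uniform distribution, whose bound depends on fan_in + fan_out."""
    rng = np.random.default_rng(0) if rng is None else rng
    limit = np.sqrt(6.0 / (fan_in + fan_out))  # standard Glorot bound
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Hyperparameters reported for the text experiments:
K, H = 100, 25  # embedding dimension and hidden units
W = glorot_uniform(K, H)
```

Every sampled weight then lies within ±√(6 / 125) ≈ ±0.219, keeping the initial activations at a scale that depends on K + H as the paper describes.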