SPINE: SParse Interpretable Neural Embeddings

Authors: Anant Subramanian, Danish Pruthi, Harsh Jhamtani, Taylor Berg-Kirkpatrick, Eduard Hovy

AAAI 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Through large-scale human evaluation, we report that our resulting word embeddings are much more interpretable than the original GloVe and word2vec embeddings. Moreover, our embeddings outperform existing popular word embeddings on a diverse suite of benchmark downstream tasks.
Researcher Affiliation | Academia | Anant Subramanian,* Danish Pruthi,* Harsh Jhamtani,* Taylor Berg-Kirkpatrick, Eduard Hovy. School of Computer Science, Carnegie Mellon University, Pittsburgh, USA. {anant,danish,jharsh,tberg,hovy}@cmu.edu
Pseudocode | No | The paper provides mathematical formulations and descriptions of the model but does not include explicit pseudocode or algorithm blocks. (A hedged sketch of the model and its loss terms follows the table.)
Open Source Code | Yes | Our code and generated word vectors are publicly available at https://github.com/harsh19/SPINE
Open Datasets | Yes | We train autoencoder models on pre-trained GloVe and word2vec embeddings. The GloVe vectors were trained on 6 billion tokens from a 2014 dump of Wikipedia and Gigaword5, while the word2vec vectors were trained on around 100 billion words from a part of the Google News dataset. ... Sentiment Analysis: This task tests the semantic properties of word embeddings. It is a sentence-level binary classification task on the Stanford Sentiment Treebank dataset (Socher et al. 2013). ... Question Classification (TREC): To facilitate research in question answering, Li and Roth (2006) propose a dataset for categorizing questions into six different types, e.g., whether the question is about a location, about a person, or about some numeric information. The TREC dataset comprises 5,452 labeled training questions and 500 test questions.
Dataset Splits | Yes | We use 15k of these words for training, and use the remaining 2k for hyperparameter tuning. ... Sentiment Analysis: ... We used the provided train, dev, and test splits with only the non-neutral labels, of sizes 8337, 1081 and 2166 sentences respectively. ... Question Classification (TREC): ... By isolating 10% of the training questions for validation, we use train/validation/test splits of 4906/546/500 questions respectively. (See the data-loading sketch below the table.)
Hardware Specification | No | The paper does not provide any specific hardware details such as GPU/CPU models, memory, or cloud computing instance types used for experiments.
Software Dependencies | No | The paper mentions using SVMs, logistic regression, and random forests, but does not specify version numbers for any software or libraries used in the experiments.
Experiment Setup | Yes | Table 3: Grid search was performed to select values of the following hyperparameters: sparsity fraction (ρ), hidden-dimension size (|H|), standard deviation of the additive isotropic zero-mean Gaussian noise (σ), and the coefficients for the ASL and PSL loss terms (λ1 and λ2). ... We observed that a hidden layer of size 1000 units is optimal for our case. (A hedged grid-search sketch follows the table.)
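
The "Open Datasets" and "Dataset Splits" rows quote the paper's data setup: autoencoders trained on pre-trained GloVe vectors, with 15k words for training and 2k held out for tuning. The following is a minimal sketch of that setup, assuming the standard GloVe plain-text format and that the file lists words in frequency order (true of the released glove.6B files); the file name, the 17k cutoff, and the load_glove helper are illustrative, not the authors' code.

```python
import numpy as np

def load_glove(path, vocab_size=17000):
    """Read GloVe's plain-text format: one word per line, followed by its vector."""
    words, vecs = [], []
    with open(path, encoding="utf-8") as f:
        for i, line in enumerate(f):
            if i >= vocab_size:
                break  # keep only the vocab_size most frequent words
            parts = line.rstrip().split(" ")
            words.append(parts[0])
            vecs.append(np.asarray(parts[1:], dtype=np.float32))
    return words, np.stack(vecs)

# 15k words for training the autoencoder, the remaining 2k for tuning,
# matching the split quoted in the "Dataset Splits" row.
words, X = load_glove("glove.6B.300d.txt")
X_train, X_tune = X[:15000], X[15000:17000]
```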
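The paper itself presents the model only as equations. Below is a minimal PyTorch sketch of a denoising autoencoder with the three loss terms named in the "Experiment Setup" row: a reconstruction loss, an average sparsity loss (ASL) that penalizes per-unit mean activations exceeding ρ, and a partial sparsity loss (PSL) that pushes activations toward 0 or 1. The class and function names, the exact penalty formulations, and the [0, 1] activation cap are assumptions inferred from the quoted description, not the released implementation.

```python
import torch
import torch.nn as nn

class SpineAutoencoder(nn.Module):
    """Hypothetical SPINE-style denoising autoencoder (name and structure assumed)."""

    def __init__(self, input_dim=300, hidden_dim=1000, noise_std=0.2):
        super().__init__()
        self.encoder = nn.Linear(input_dim, hidden_dim)
        self.decoder = nn.Linear(hidden_dim, input_dim)
        self.noise_std = noise_std  # sigma: std. dev. of the additive Gaussian noise

    def forward(self, x):
        if self.training:
            x = x + self.noise_std * torch.randn_like(x)  # denoising corruption
        z = torch.clamp(self.encoder(x), 0.0, 1.0)  # activations capped to [0, 1]
        return self.decoder(z), z

def spine_loss(x, x_hat, z, rho=0.15, lambda1=1.0, lambda2=1.0):
    """Reconstruction + ASL + PSL; penalty forms inferred, defaults are placeholders."""
    recon = ((x_hat - x) ** 2).sum(dim=1).mean()             # reconstruction loss
    mean_act = z.mean(dim=0)                                 # per-unit average activation
    asl = torch.clamp(mean_act - rho, min=0.0).pow(2).sum()  # average sparsity loss
    psl = (z * (1.0 - z)).sum(dim=1).mean()                  # pushes activations to 0 or 1
    return recon + lambda1 * asl + lambda2 * psl
```

The hidden size of 1000 reflects the paper's reported optimum; clipping activations to [0, 1] keeps both sparsity penalties well defined.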
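Finally, a sketch of the hyperparameter sweep described in the "Experiment Setup" row. The value grids here are placeholders (the paper's Table 3 lists the ranges actually searched), and train_and_evaluate is a hypothetical stub standing in for the omitted training loop over the 15k/2k split.

```python
from itertools import product

def train_and_evaluate(rho, hidden_dim, noise_std, lambda1, lambda2):
    """Hypothetical stub: train on the 15k training words, score on the 2k tuning words."""
    return 0.0  # placeholder; the actual training loop is omitted in this sketch

# Illustrative value grids; not the ranges reported in the paper's Table 3.
grid = {
    "rho": [0.15, 0.2],         # sparsity fraction
    "hidden_dim": [500, 1000],  # |H|; the paper found 1000 units optimal
    "noise_std": [0.2, 0.4],    # sigma of the additive Gaussian noise
    "lambda1": [0.1, 1.0],      # ASL coefficient
    "lambda2": [0.1, 1.0],      # PSL coefficient
}

best_score, best_config = float("-inf"), None
for values in product(*grid.values()):
    config = dict(zip(grid.keys(), values))
    score = train_and_evaluate(**config)
    if score > best_score:
        best_score, best_config = score, config
```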