Learning Word Representations with Hierarchical Sparse Coding
Authors: Dani Yogatama, Manaal Faruqui, Chris Dyer, Noah A. Smith
ICML 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on various benchmark tasks (word similarity ranking, syntactic and semantic analogies, sentence completion, and sentiment analysis) demonstrate that the method outperforms or is competitive with state-of-the-art methods. |
| Researcher Affiliation | Academia | Dani Yogatama DYOGATAMA@CS.CMU.EDU Manaal Faruqui MFARUQUI@CS.CMU.EDU Chris Dyer CDYER@CS.CMU.EDU Noah A. Smith NASMITH@CS.CMU.EDU Language Technologies Institute, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA |
| Pseudocode | Yes | Algorithm 1 Fast algorithm for learning word representations with the forest regularizer. |
| Open Source Code | No | Our word representations are available at: http://www.ark.cs.cmu.edu/dyogatam/wordvecs/. This link provides the learned word representations (data), not the source code for the methodology described in the paper. |
| Open Datasets | Yes | We use the WMT-2011 English news corpus as our training data. The corpus contains about 15 million sentences and 370 million words. The size of our vocabulary is 180,834. |
| Dataset Splits | Yes | We use the movie reviews dataset from Socher et al. (2013). The dataset consists of 6,920 sentences for training, 872 sentences for development, and 1,821 sentences for testing. |
| Hardware Specification | No | The paper mentions '640 cores' for parts of the training and 'computing resources provided by Google and the Pittsburgh Supercomputing Center', but does not provide specific hardware details like CPU/GPU models, memory, or exact cloud instance specifications. |
| Software Dependencies | No | The paper mentions using implementations for baseline methods like 'http://rnnlm.org/' and 'https://code.google.com/p/word2vec/', and 'http://nlp.stanford.edu/projects/glove/', but does not specify version numbers for these or any other software dependencies crucial for replication. |
| Experiment Setup | Yes | In our experiments, we use forests similar to those in Figure 1 to organize the latent word space. We choose to evaluate performance with M = 52 (4 trees) and M = 520 (40 trees). We set λ = 0.1. |
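
The "Pseudocode" and "Experiment Setup" rows above refer to the paper's forest regularizer: the M latent dimensions are organized into small trees (M = 52 corresponds to 4 trees of 13 nodes each), and for each word vector the penalty sums the l2 norms of every node together with its descendants. The sketch below is an illustrative reconstruction, not the authors' Algorithm 1 or their released artifacts; the tree layout, function names, and the bottom-up proximal pass (in the style of Jenatton et al., 2011) are assumptions made here for clarity.

```python
# Minimal sketch of a forest (tree-structured group lasso) regularizer.
# All names and the toy tree structure below are illustrative assumptions,
# not the paper's exact Algorithm 1.
import numpy as np

def descendants(children, i):
    """Return all descendants of node i (excluding i itself)."""
    out, stack = [], list(children.get(i, []))
    while stack:
        j = stack.pop()
        out.append(j)
        stack.extend(children.get(j, []))
    return out

def forest_penalty(a, children, lam=0.1):
    """Omega(a) = lam * sum_i || a[{i} + Descendants(i)] ||_2."""
    total = 0.0
    for i in range(len(a)):
        group = [i] + descendants(children, i)
        total += np.linalg.norm(a[group])
    return lam * total

def prox_forest(a, children, order, step_lam):
    """One bottom-up pass of group soft-thresholding (proximal operator for
    tree-structured group lasso); `order` must list every child before its
    parent."""
    a = a.copy()
    for i in order:
        group = [i] + descendants(children, i)
        norm = np.linalg.norm(a[group])
        if norm <= step_lam:
            a[group] = 0.0
        else:
            a[group] *= (1.0 - step_lam / norm)
    return a

# Toy example: one tree with 13 nodes (M = 52 in the paper corresponds to
# 4 such trees); this particular branching structure is made up.
children = {0: [1, 2], 1: [3, 4, 5], 2: [6, 7], 3: [8, 9], 6: [10, 11, 12]}
a = np.random.randn(13)
print(forest_penalty(a, children, lam=0.1))
print(prox_forest(a, children, order=list(range(12, -1, -1)), step_lam=0.05))
```

The paper sets λ = 0.1 for the regularizer; the threshold passed to the proximal step above would in practice be scaled by the optimizer's learning rate (the paper uses an online parallel algorithm), so the value in the toy call is arbitrary.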