Hash Embeddings for Efficient Word Representations

Authors: Dan Tito Svenstrup, Jonas Hansen, Ole Winther

NeurIPS 2017 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We benchmark hash embeddings with and without dictionaries on text classification tasks. We evaluate hash embeddings on 7 different datasets... The performance of the model when using each of the two embedding types can be seen in the left side of table 2. Our experiments show that the performance of hash embeddings is always at par with using standard embeddings, and in most cases better.
Researcher Affiliation | Collaboration | Dan Svenstrup, Department for Applied Mathematics and Computer Science, Technical University of Denmark (DTU), 2800 Lyngby, Denmark, dsve@dtu.dk; Jonas Meinertz Hansen, FindZebra, Copenhagen, Denmark, jonas@findzebra.com; Ole Winther, Department for Applied Mathematics and Computer Science, Technical University of Denmark (DTU), 2800 Lyngby, Denmark, olwi@dtu.dk
Pseudocode | No | The paper describes the steps of hash embedding construction in prose and illustrates them with a diagram (Fig. 1), but it does not contain a structured pseudocode or algorithm block. (A hedged sketch of the lookup appears after the table.)
Open Source Code | No | The paper does not provide concrete access to source code (a specific repository link, an explicit code release statement, or code in supplementary materials) for the methodology described in the paper.
Open Datasets | Yes | We evaluate hash embeddings on 7 different datasets in the form introduced by Zhang et al. (2015) for various text classification tasks including topic classification, sentiment analysis, and news categorization. An overview of the datasets can be seen in table 1.
Dataset Splits | Yes | We use early stopping with a patience of 10, and use 5% of the training data as validation data.
Hardware Specification | Yes | The training was performed on an Nvidia GeForce GTX TITAN X with 12 GB of memory. The small performance difference was observed when using Keras with a TensorFlow backend on a GeForce GTX TITAN X with 12 GB of memory and an Nvidia GeForce GTX 660 with 2 GB of memory.
Software Dependencies | No | The paper states 'All models were implemented using Keras with TensorFlow backend' but does not provide specific version numbers for Keras, TensorFlow, or any other ancillary software components.
Experiment Setup | Yes | All the models are trained by minimizing the cross entropy using the stochastic gradient descent-based Adam method (Kingma and Ba, 2014) with a learning rate set to α = 0.001. We use early stopping with a patience of 10, and use 5% of the training data as validation data. The hash embeddings use K = 10M different importance parameter vectors, k = 2 hash functions, and B = 1M component vectors of dimension d = 20. (A hedged training-configuration sketch appears after the table.)
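
As noted in the Pseudocode row, the paper describes the hash embedding lookup only in prose and a diagram: a token is hashed to an id, k hash functions pick k vectors from a shared component table, and trainable importance parameters weight their sum. The following is a minimal sketch of that lookup under stated assumptions; the hash functions, table sizes, and initialization are illustrative and this is not the authors' implementation.

```python
import zlib
import numpy as np

# Illustrative sizes; the paper's no-dictionary setup uses K = 10M importance
# parameter vectors, B = 1M component vectors, k = 2 hash functions, d = 20.
K, B, k, d = 10_000, 1_000, 2, 20

rng = np.random.default_rng(0)
E = rng.normal(scale=0.1, size=(B, d))  # shared pool of component vectors
P = rng.normal(scale=0.1, size=(K, k))  # importance parameters: k weights per token id

def hash_embedding(token: str) -> np.ndarray:
    """Importance-weighted sum of k hashed component vectors."""
    # First-level hash: map the token to an id in {0, ..., K-1}
    # (stand-in for the hashing-trick / dictionary step in the paper).
    token_id = zlib.crc32(token.encode()) % K
    # k hash functions map the token id into the shared component table.
    component_ids = [zlib.crc32(f"{i}:{token_id}".encode()) % B for i in range(k)]
    components = E[component_ids]        # shape (k, d)
    importance = P[token_id]             # shape (k,)
    return importance @ components       # final embedding, shape (d,)

print(hash_embedding("zebra").shape)     # -> (20,)
```

In training, E and P would be learned jointly with the rest of the model; only B component vectors and K small importance vectors are stored, which is what gives the memory saving over a standard embedding table.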
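For the reported training protocol (Adam with α = 0.001, cross-entropy loss, early stopping with patience 10, and 5% of the training data held out for validation), the sketch below uses the current tf.keras API; the dummy data, the tiny embedding-plus-softmax model, and the epoch count are placeholders rather than the authors' architecture, and the original work used an unspecified, older Keras/TensorFlow version.

```python
import numpy as np
from tensorflow import keras

# Placeholder data and model: integer token ids feeding a standard embedding
# layer, mean pooling, and a softmax classifier.
vocab_size, seq_len, num_classes = 10_000, 100, 4
x_train = np.random.randint(vocab_size, size=(1_000, seq_len))
y_train = keras.utils.to_categorical(np.random.randint(num_classes, size=1_000), num_classes)

model = keras.Sequential([
    keras.layers.Embedding(vocab_size, 20),
    keras.layers.GlobalAveragePooling1D(),
    keras.layers.Dense(num_classes, activation="softmax"),
])

# Reported setup: Adam with learning rate 0.001, cross-entropy loss.
model.compile(optimizer=keras.optimizers.Adam(learning_rate=1e-3),
              loss="categorical_crossentropy")

# Early stopping with patience 10; 5% of the training data used for validation.
model.fit(x_train, y_train,
          epochs=100,  # illustrative upper bound; training stops early
          validation_split=0.05,
          callbacks=[keras.callbacks.EarlyStopping(patience=10)])
```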