Embedding Compression with Isotropic Iterative Quantization

Authors: Siyu Liao, Jie Chen, Yanzhi Wang, Qinru Qiu, Bo Yuan (pp. 8336-8343)

AAAI 2020

Each reproducibility variable is listed below with its result and the supporting LLM response.
Research Type: Experimental
We run the proposed method on pre-trained embedding vectors and evaluate the compressed embedding on various NLP tasks. For some tasks, the evaluation is conducted directly on the embedding (e.g., measuring the cosine similarity between word vectors); for others, a classifier is trained with the embedding. We conduct all experiments in Python using NumPy and Keras.
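For the tasks evaluated directly on the embedding, word-similarity benchmarks such as WS353 are conventionally scored by correlating cosine similarities with human ratings. The paper does not spell out its scoring protocol, so the following is a minimal sketch assuming the standard Spearman-correlation setup; `emb`, `pairs`, and `human_scores` are hypothetical inputs, not names from the paper.

```python
import numpy as np
from scipy.stats import spearmanr

def word_similarity_eval(emb, pairs, human_scores):
    """Correlate cosine similarities with human ratings (assumed protocol).

    emb:          dict mapping word -> 1-D numpy vector (hypothetical input)
    pairs:        list of (word1, word2) tuples from a benchmark
    human_scores: human similarity ratings aligned with `pairs`
    """
    cosines = [
        emb[w1] @ emb[w2] / (np.linalg.norm(emb[w1]) * np.linalg.norm(emb[w2]))
        for w1, w2 in pairs
    ]
    # Spearman rank correlation between model and human similarity rankings.
    return spearmanr(cosines, human_scores).correlation
```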
Researcher Affiliation: Collaboration
Siyu Liao (Department of Electrical and Computer Engineering, Rutgers University); Jie Chen (MIT-IBM Watson AI Lab, IBM Research); Yanzhi Wang (Department of Electrical and Computer Engineering, Northeastern University); Qinru Qiu (Department of Electrical Engineering and Computer Science, Syracuse University); Bo Yuan (Department of Electrical and Computer Engineering, Rutgers University)
Pseudocode: Yes
Algorithm 1: Isotropic Iterative Quantization
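Algorithm 1 itself appears only in the paper. As a rough illustration of the iterative-quantization core that the method's name references (and that the ITQ baseline implements), here is a minimal NumPy sketch of the alternating updates minimizing the quantization loss ||B - VR||_F^2 over binary codes B and an orthogonal rotation R. The isotropy-specific steps of IIQ are omitted; this is a sketch, not the authors' implementation.

```python
import numpy as np

def itq_style_quantization(V, T=50, seed=0):
    """Minimal ITQ-style loop: alternately fix the rotation R and the
    binary codes B to reduce the quantization loss ||B - V @ R||_F^2.

    V : (n, d) pre-trained embedding matrix (assumed already preprocessed).
    Returns binary codes in {-1, +1} and the learned rotation R.
    """
    rng = np.random.default_rng(seed)
    d = V.shape[1]
    R, _ = np.linalg.qr(rng.standard_normal((d, d)))  # random orthogonal init
    for _ in range(T):  # T = 50 in the quoted experiment setup
        B = np.where(V @ R >= 0, 1.0, -1.0)  # fix R, update binary codes
        # Fix B, update R: the orthogonal Procrustes solution is R = U @ Wt,
        # where U, _, Wt = svd(V.T @ B).
        U, _, Wt = np.linalg.svd(V.T @ B)
        R = U @ Wt
    return np.where(V @ R >= 0, 1.0, -1.0), R
```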
Open Source Code: No
The paper provides neither an explicit statement about nor a direct link to open-source code for the IIQ method. It references third-party benchmarks and toolkits used in the experiments, but not its own implementation.
Open Datasets: Yes
We perform experiments with the GloVe embedding (Pennington, Socher, and Manning 2014) and the HDC embedding (Sun et al. 2015). In addition, we evaluate embedding compression on a CNN model pre-trained with the IMDB data set (Maas et al. 2011). Four data sets are selected from (Wang and Manning 2012), including movie review (MR), customer review (CR), opinion polarity (MPQA), and subjectivity (SUBJ). Seven data sets are used, including MEN (Bruni, Tran, and Baroni 2014); MTurk (Radinsky et al. 2011); RG65 (Rubenstein and Goodenough 1965); RW (Luong, Socher, and Manning 2013); SimLex-999 (Hill, Reichart, and Korhonen 2015); TR9856 (Levy et al. 2015); and WS353 (Agirre et al. 2009). Four data sets are used in this experiment: Almuhareb-Poesio (AP) (Almuhareb and Poesio 2005); BLESS (Baroni and Lenci 2011); Battig (Battig and Montague 1969); and the ESSLLI 2008 Workshop data set (M. Baroni and Lenci 2008).
Dataset Splits: Yes
The IMDB data set contains 25,000 movie reviews for training and another 25,000 for testing; we randomly separate 5,000 reviews from the training set as validation data. Five-fold cross-validation is used to report classification accuracy.
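The quoted IMDB split is straightforward to reproduce with the Keras-bundled copy of the data set; below is a minimal sketch. The vocabulary cap (num_words) and the random seed are assumptions not stated in the paper.

```python
import numpy as np
from tensorflow.keras.datasets import imdb

# 25,000 training and 25,000 test reviews, as stated in the paper.
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=20000)

# Randomly hold out 5,000 training reviews as validation data.
rng = np.random.default_rng(0)  # seed is an assumption
idx = rng.permutation(len(x_train))
val_idx, train_idx = idx[:5000], idx[5000:]
x_val, y_val = x_train[val_idx], y_train[val_idx]
x_train, y_train = x_train[train_idx], y_train[train_idx]
```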
Hardware Specification: Yes
The environment is Ubuntu 16.04 with an Intel(R) Xeon(R) CPU E5-2698.
Software Dependencies: No
The paper states, "We conduct all experiments in Python by using Numpy and Keras" and "The CNN model follows the Keras tutorial," but it does not provide version numbers for Python, NumPy, or Keras, which are necessary for fully reproducible software dependencies.
Experiment Setup: Yes
We train the DCCL method for 200 epochs and set the batch size to 1024 for GloVe and 64 for HDC. For our method, we set the iteration number T to 50, since early stopping works sufficiently well; we set the same iteration number for ITQ. We also set the parameter D to 2 for HDC and 14 for the GloVe embedding. Moreover, we use the Adam optimizer with learning rate 0.0001, sentence length 400, batch size 128, and train for 20 epochs.
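The quoted CNN hyperparameters map directly onto a Keras training loop. The sketch below follows the stock Keras IMDB-CNN example that the paper says its model is based on; the vocabulary size, embedding dimension, and filter counts are assumptions taken from that tutorial, not from the paper.

```python
from tensorflow.keras import layers, models, optimizers
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

MAXLEN = 400                 # sentence length from the quoted setup
VOCAB, EMB_DIM = 20000, 50   # assumed; not stated in the paper

(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=VOCAB)
x_train = pad_sequences(x_train, maxlen=MAXLEN)
x_test = pad_sequences(x_test, maxlen=MAXLEN)

model = models.Sequential([
    layers.Embedding(VOCAB, EMB_DIM, input_length=MAXLEN),
    layers.Conv1D(250, 3, activation="relu"),   # tutorial defaults, assumed
    layers.GlobalMaxPooling1D(),
    layers.Dense(250, activation="relu"),
    layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer=optimizers.Adam(learning_rate=1e-4),  # lr 0.0001
              loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x_train, y_train, batch_size=128, epochs=20,        # quoted setup
          validation_split=0.2)  # 5,000 of the 25,000 reviews held out
```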