All-but-the-Top: Simple and Effective Postprocessing for Word Representations

Authors: Jiaqi Mu, Pramod Viswanath

ICLR 2018

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification), on multiple datasets, with a variety of representation methods and hyperparameter choices, and in multiple languages; in each case, the processed representations are consistently better than the original ones.
Researcher Affiliation | Academia | Jiaqi Mu, Pramod Viswanath, University of Illinois at Urbana-Champaign, {jiaqimu2, pramodv}@illinois.edu
Pseudocode | Yes | Algorithm 1: Postprocessing algorithm on word representations.
    Input: word representations {v(w), w ∈ V}; a threshold parameter D.
    1. Compute the mean of {v(w), w ∈ V}: µ ← (1/|V|) Σ_{w∈V} v(w); ṽ(w) ← v(w) − µ.
    2. Compute the PCA components: u1, ..., ud ← PCA({ṽ(w), w ∈ V}).
    3. Preprocess the representations: v′(w) ← ṽ(w) − Σ_{i=1..D} (u_i⊤ ṽ(w)) u_i.
    Output: processed representations v′(w).
    (A NumPy sketch of this procedure is given after the table.)
Open Source Code | No | The paper provides links to third-party word representations and a third-party CNN text classification implementation, but does not state that the code for their proposed postprocessing methodology is open-source or publicly available.
Open Datasets | Yes | For this experiment, we use seven standard datasets: the first published RG65 dataset (Rubenstein & Goodenough, 1965); the widely used WordSim-353 (WS) dataset (Finkelstein et al., 2001)...
Dataset Splits | Yes | In TREC, SST and IMDb, the datasets have already been split into train/test sets. Otherwise we use 10-fold cross validation in the remaining datasets (i.e., MR and SUBJ). Detailed statistics of various features of each of the datasets are provided in Table 21. (A sketch of the 10-fold protocol is given after the table.)
Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies | No | The paper mentions 'implemented using tensorflow' but does not provide specific version numbers for TensorFlow or any other software libraries used.
Experiment Setup | No | While the paper specifies the hyperparameter 'D' for its postprocessing (e.g., 'We choose D = 3 for WORD2VEC and D = 2 for GLOVE' and 'D to vary between 0 and 4'), it lacks other crucial experimental setup details such as learning rates, batch sizes, number of epochs, or specific optimizer settings for the neural network models used. (A sketch of a sweep over D is given after the table.)
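
Since the authors do not release code, the following is a minimal NumPy sketch of Algorithm 1 as described above: subtract the common mean vector, compute the principal components of the centred embeddings, and remove the projections onto the top D components. The function name all_but_the_top and the random placeholder matrix are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def all_but_the_top(vectors: np.ndarray, D: int) -> np.ndarray:
    """Sketch of Algorithm 1.

    vectors: (|V|, d) matrix of word representations.
    D: number of dominant principal components to remove.
    """
    # Step 1: remove the common mean vector.
    mu = vectors.mean(axis=0)
    centered = vectors - mu
    # Step 2: principal directions of the centred vectors
    # (rows of Vt are u_1, ..., u_d, ordered by singular value).
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    U = Vt[:D]                                 # top-D directions, shape (D, d)
    # Step 3: subtract the projection onto the top-D directions.
    return centered - centered @ U.T @ U

# Illustrative call; the paper reports D = 3 for WORD2VEC and D = 2 for GLOVE.
embeddings = np.random.randn(10000, 300)       # placeholder for real 300-d vectors
processed = all_but_the_top(embeddings, D=3)
```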
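The paper varies D between 0 and 4 in its experiments. Below is a hedged sketch of how such a sweep might be scored on a word-similarity dataset, using the Spearman correlation between cosine similarities and human ratings. It reuses all_but_the_top from the sketch above; the vocabulary, word pairs, and ratings are hypothetical placeholders, not the actual WordSim-353 data or the authors' evaluation code.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
vocab = [f"w{i}" for i in range(1000)]
word_index = {w: i for i, w in enumerate(vocab)}
embeddings = rng.standard_normal((len(vocab), 300))       # placeholder vectors
word_pairs = [("w1", "w2"), ("w3", "w4"), ("w5", "w6")]   # placeholder word pairs
human_scores = [7.5, 3.2, 5.1]                            # placeholder human ratings

def similarity_score(vectors):
    """Spearman correlation between cosine similarities and human ratings."""
    sims = []
    for w1, w2 in word_pairs:
        v1, v2 = vectors[word_index[w1]], vectors[word_index[w2]]
        sims.append(v1 @ v2 / (np.linalg.norm(v1) * np.linalg.norm(v2)))
    return spearmanr(sims, human_scores).correlation

for D in range(5):                                        # D varied between 0 and 4
    vecs = all_but_the_top(embeddings, D) if D > 0 else embeddings
    print(f"D={D}  Spearman={similarity_score(vecs):.3f}")
```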
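For the text-classification datasets without a fixed split (MR and SUBJ), the paper reports 10-fold cross validation. A minimal sketch of that protocol, assuming scikit-learn and using hypothetical placeholder documents and labels in place of the real datasets and the CNN classifier:

```python
from sklearn.model_selection import KFold

# Placeholder corpus; the real experiments use the MR and SUBJ datasets.
texts = [f"sentence {i}" for i in range(100)]
labels = [i % 2 for i in range(100)]

kf = KFold(n_splits=10, shuffle=True, random_state=0)
for fold, (train_idx, test_idx) in enumerate(kf.split(texts)):
    train_texts = [texts[i] for i in train_idx]
    test_texts = [texts[i] for i in test_idx]
    # ...train the CNN text classifier on the train fold, evaluate on the test fold...
    print(f"fold {fold}: {len(train_texts)} train / {len(test_texts)} test examples")
```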