No Training Required: Exploring Random Encoders for Sentence Classification

Authors: John Wieting, Douwe Kiela

ICLR 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We explore various methods for computing sentence representations from pretrained word embeddings without any training, i.e., using nothing but random parameterizations. In our experiments, we evaluate on a standard sentence representation benchmark using SentEval (Conneau & Kiela, 2018). (A minimal sketch of such a random encoder follows the table.)
Researcher Affiliation | Collaboration | John Wieting (Carnegie Mellon University, jwieting@cs.cmu.edu); Douwe Kiela (Facebook AI Research, dkiela@fb.com)
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | Code is available at https://github.com/facebookresearch/randsent.
Open Datasets | Yes | We use the publicly available 300-dimensional GloVe embeddings (Pennington et al., 2014) trained on Common Crawl for all experiments. The set of downstream tasks we use for evaluation comprises sentiment analysis (MR, SST), question-type classification (TREC), product reviews (CR), subjectivity (SUBJ), opinion polarity (MPQA), paraphrasing (MRPC), entailment (SICK-E, SNLI), and semantic relatedness (SICK-R, STSB). The probing tasks consist of those in Conneau et al. (2018).
Dataset Splits | Yes | We compute the average accuracy/Pearson's r, along with the standard deviation, over 5 different seeds for the random methods, and tune on validation for each task. Training is stopped when validation performance has not increased 5 times; checks for validation performance occur every 4 epochs. (See the early-stopping sketch after this table.)
Hardware Specification | No | The paper mentions the need to 'fit things onto a modern GPU' but does not provide specific details about the hardware used for experiments, such as GPU/CPU models, processors, or memory.
Software Dependencies | No | The paper mentions using 'SentEval' and 'Adam' for optimization but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | We use the default SentEval settings, which are to train with a logistic regression classifier, use a batch size of 64, a maximum number of epochs of 200 with early stopping, no dropout, and use Adam (Kingma & Ba, 2014) for optimization with a learning rate of 0.001. For the ESNs, we only tune whether to use a ReLU or no activation function, the spectral radius from {0.4, 0.6, 0.8, 1.0}, the range of the uniform distribution for initializing W_i where the max distance from zero is selected from {0.01, 0.05, 0.1, 0.2}, and finally the fraction of elements in W_h that are set to 0, i.e., sparsity, which is selected from {0, 0.25, 0.5, 0.75}. (An ESN configuration sketch covering this grid follows the table.)
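
The Research Type row quotes the paper's core idea: sentence representations built from pretrained word embeddings and purely random, untrained parameters. The sketch below illustrates a bag-of-random-embedding-projections style encoder in that spirit; the projection size, initialization range, and the ReLU/max-pooling choices are illustrative assumptions rather than the paper's exact configuration, and the function names are hypothetical.

```python
# Minimal sketch of a random-projection ("BOREP"-style) sentence encoder.
# The projection matrix W is random and never trained; only a downstream
# classifier (e.g. the logistic regression fit by SentEval) would be trained on top.
# Dimensions and the uniform initialization range here are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)

embed_dim = 300    # GloVe dimensionality used in the paper
proj_dim = 4096    # assumed output dimensionality

# Random projection, left untouched after initialization.
W = rng.uniform(-0.1, 0.1, size=(proj_dim, embed_dim))

def encode(word_vectors: np.ndarray, pooling: str = "max") -> np.ndarray:
    """Project each word vector with the fixed random matrix, then pool over the sentence.

    word_vectors: array of shape (num_words, embed_dim), e.g. rows looked up from GloVe.
    """
    projected = word_vectors @ W.T              # (num_words, proj_dim)
    projected = np.maximum(projected, 0.0)      # optional ReLU nonlinearity
    if pooling == "max":
        return projected.max(axis=0)
    return projected.mean(axis=0)

# Example: a toy "sentence" of 5 random vectors standing in for GloVe lookups.
sentence = rng.normal(size=(5, embed_dim))
representation = encode(sentence)
print(representation.shape)                     # (4096,)
```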
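The Dataset Splits row describes the stopping rule used during classifier training: validation performance is checked every 4 epochs, and training stops once 5 consecutive checks bring no improvement. A minimal sketch of that schedule, with hypothetical `train_one_epoch` and `validation_accuracy` callables standing in for whatever classifier is being fit:

```python
# Sketch of the early-stopping schedule described in the Dataset Splits row.
# `train_one_epoch` and `validation_accuracy` are hypothetical placeholders.
def train_with_early_stopping(train_one_epoch, validation_accuracy,
                              max_epochs=200, check_every=4, patience=5):
    best_acc = float("-inf")
    checks_without_improvement = 0
    for epoch in range(1, max_epochs + 1):
        train_one_epoch()
        if epoch % check_every != 0:
            continue                      # only validate every `check_every` epochs
        acc = validation_accuracy()
        if acc > best_acc:
            best_acc = acc
            checks_without_improvement = 0
        else:
            checks_without_improvement += 1
            if checks_without_improvement >= patience:
                break                     # no improvement for `patience` checks
    return best_acc
```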
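The Experiment Setup row lists the ESN hyperparameters that are tuned. The sketch below shows, under simplifying assumptions (no bias term, a single forward reservoir, an arbitrary reservoir size), how those hyperparameters could enter a fixed random reservoir, together with the quoted search grid; names and the exact initialization scheme are illustrative and not taken from the released code.

```python
# Sketch of an echo state network (ESN) reservoir configured from the grid
# quoted in the Experiment Setup row. All reservoir weights stay fixed after
# initialization; only a downstream classifier would be trained.
import itertools
import numpy as np

rng = np.random.default_rng(0)

def make_reservoir(embed_dim, hidden_dim, spectral_radius, input_range, sparsity):
    # Input weights W_i ~ Uniform(-input_range, input_range).
    W_i = rng.uniform(-input_range, input_range, size=(hidden_dim, embed_dim))
    # Recurrent weights W_h: sparsify, then rescale to the target spectral radius.
    W_h = rng.normal(size=(hidden_dim, hidden_dim))
    W_h[rng.random(W_h.shape) < sparsity] = 0.0
    radius = max(abs(np.linalg.eigvals(W_h)))
    W_h *= spectral_radius / radius
    return W_i, W_h

def esn_states(word_vectors, W_i, W_h, use_relu=True):
    # Run the fixed reservoir over the word vectors; pool the states for a sentence vector.
    h = np.zeros(W_h.shape[0])
    states = []
    for x in word_vectors:
        h = W_i @ x + W_h @ h
        if use_relu:
            h = np.maximum(h, 0.0)
        states.append(h)
    return np.stack(states)              # (num_words, hidden_dim)

# Example reservoir with an assumed hidden size of 500.
W_i, W_h = make_reservoir(embed_dim=300, hidden_dim=500,
                          spectral_radius=0.8, input_range=0.1, sparsity=0.5)

# The hyperparameter grid quoted in the Experiment Setup row.
grid = itertools.product([True, False],            # ReLU or no activation
                         [0.4, 0.6, 0.8, 1.0],     # spectral radius
                         [0.01, 0.05, 0.1, 0.2],   # max distance from zero for W_i init
                         [0.0, 0.25, 0.5, 0.75])   # fraction of W_h entries set to 0
```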