Invariance and identifiability issues for word embeddings

Authors: Rachel Carrington, Karthik Bharath, Simon Preston

NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | "We provide a formal treatment of the above identifiability issue, present some numerical examples, and discuss possible resolutions." |
| Researcher Affiliation | Academia | Rachel Carrington, Karthik Bharath, Simon Preston; School of Mathematical Sciences, University of Nottingham; {rachel.carrington, karthik.bharath, simon.preston}@nottingham.ac.uk |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide access to source code for its own methodology. It references existing models and corpora such as GloVe and word2vec, but not its own implementation. |
| Open Datasets | Yes | "The embedding is from model (3), with X taken to be a document-term matrix computed from the Corpus of Historical American English [Davies, 2012]... V is a GloVe embedding with d = 300 trained on Wikipedia 2014 + Gigaword 5 corpus... word2vec embeddings trained on the 100-billion-word Google News corpus" |
| Dataset Splits | No | The paper mentions using specific test sets but does not give training/validation/test splits (e.g., percentages or sample counts) for the datasets it uses. |
| Hardware Specification | No | The paper does not provide hardware details (e.g., CPU/GPU models, memory) for running its experiments. |
| Software Dependencies | No | The paper mentions R's `optim` implementation of the Nelder-Mead method (see the sketch after this table) but does not specify version numbers for R or any packages. |
| Experiment Setup | No | The paper describes using the Nelder-Mead method for optimization and specifies embedding dimensions (e.g., d = 300), but it does not provide concrete hyperparameters (e.g., learning rate, batch size, number of epochs) or system-level training settings for its experiments. |
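As a concrete illustration of the identifiability issue the paper treats, the following R sketch builds a toy embedding, applies an arbitrary rotation, and checks that all word-word inner products (hence cosine similarities) are unchanged, so the two embeddings are observationally equivalent; it then recovers the rotation by least squares using R's `optim` with Nelder-Mead, the optimizer the paper reports using. This is a minimal sketch, not the paper's code: the matrix sizes, random seed, and alignment objective are illustrative assumptions.

```r
# Minimal sketch (assumed setup, not the paper's code): rotation invariance
# of embedding inner products, plus a Nelder-Mead alignment step via optim().

set.seed(1)
n <- 50  # number of words (assumed)
d <- 2   # embedding dimension; d = 2 so a rotation has a single angle

V <- matrix(rnorm(n * d), n, d)  # toy word-embedding matrix

theta_true <- 0.7  # arbitrary rotation angle
rot <- function(theta) {
  matrix(c(cos(theta), -sin(theta),
           sin(theta),  cos(theta)), 2, 2)
}
V_rot <- V %*% rot(theta_true)  # rotated embedding

# Inner products (hence cosine similarities) are unchanged by the rotation:
print(max(abs(V %*% t(V) - V_rot %*% t(V_rot))))  # ~1e-15, numerically zero

# Recover the rotation by minimising the Frobenius misalignment with
# Nelder-Mead via R's optim(). (optim() warns that Nelder-Mead is
# unreliable in one dimension; it is adequate for this toy check.)
loss <- function(theta) sum((V %*% rot(theta) - V_rot)^2)
fit <- optim(par = 0, fn = loss, method = "Nelder-Mead")
print(fit$par)  # close to theta_true = 0.7
```

The check prints a value at numerical zero because an orthogonal transformation cancels inside the Gram matrix, which is exactly why the embedding matrix itself is not identifiable from inner-product-based objectives.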