Invariance and identifiability issues for word embeddings
Authors: Rachel Carrington, Karthik Bharath, Simon Preston
NeurIPS 2019
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We provide a formal treatment of the above identifiability issue, present some numerical examples, and discuss possible resolutions. |
| Researcher Affiliation | Academia | Rachel Carrington Karthik Bharath Simon Preston School of Mathematical Sciences, University of Nottingham {rachel.carrington, karthik.bharath, simon.preston}@nottingham.ac.uk |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. |
| Open Source Code | No | The paper does not provide concrete access to source code for the methodology described in this paper. It references existing models/corpora like GloVe and word2vec, but not its own implementation code. |
| Open Datasets | Yes | The embedding is from model (3), with X taken to be a document-term matrix computed from the Corpus of Historical American English [Davies, 2012]... V is a GloVe embedding with d = 300 trained on Wikipedia 2014 + Gigaword 5 corpus... word2vec embeddings trained on the 100-billion word Google News corpus |
| Dataset Splits | No | The paper mentions using specific test sets but does not provide details on training, validation, or test splits (e.g., percentages or sample counts) for the datasets it uses. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions "R's `optim` implementation of the Nelder-Mead method" but does not specify version numbers for R or the `optim` function. |
| Experiment Setup | No | The paper describes using the Nelder-Mead method for optimization and specifies embedding dimensions (e.g., d = 300). However, it does not provide concrete hyperparameters (e.g., learning rate, batch size, number of epochs) or specific system-level training settings for its experiments. |
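The identifiability issue the paper formalizes is that a word embedding is typically determined only up to an orthogonal transformation: rotating every word vector by the same orthogonal matrix leaves pairwise similarities unchanged. A minimal NumPy sketch (illustrative only, not code from the paper) demonstrates this invariance:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "embedding": 5 words in d = 4 dimensions (illustrative values only).
V = rng.normal(size=(5, 4))

# A random orthogonal matrix Q, obtained via QR decomposition.
Q, _ = np.linalg.qr(rng.normal(size=(4, 4)))

# Rotated embedding: every word vector is transformed by the same Q.
V_rot = V @ Q

def cosine_matrix(M):
    """Pairwise cosine similarities between the rows of M."""
    U = M / np.linalg.norm(M, axis=1, keepdims=True)
    return U @ U.T

# Since (V Q)(V Q)^T = V Q Q^T V^T = V V^T, cosine similarities (and hence
# nearest-neighbour rankings) are identical for V and V_rot.
print(np.allclose(cosine_matrix(V), cosine_matrix(V_rot)))  # True
```

Because similarity-based evaluations cannot distinguish V from V_rot, the particular orientation returned by a trained model such as GloVe or word2vec is not identifiable from such tasks alone.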