What the Vec? Towards Probabilistically Grounded Embeddings

Authors: Carl Allen, Ivana Balazevic, Timothy Hospedales

NeurIPS 2019

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Here we draw on previous results and run test experiments to provide empirical support for our main theoretical results: ... Table 1: Accuracy in semantic tasks using different loss functions on the text8 corpus [24].
Researcher Affiliation	Collaboration	1 School of Informatics, University of Edinburgh, UK 2 Samsung AI Centre, Cambridge, UK
Pseudocode	No	No pseudocode or algorithm blocks were found in the paper.
Open Source Code	No	The paper does not provide any concrete access to source code (specific repository link, explicit code release statement, or code in supplementary materials) for the methodology described.
Open Datasets	Yes	We learn 500 dimensional embeddings from word co-occurrences extracted from a standard corpus (text8 [24]). ... [24] Matt Mahoney. text8 wikipedia dump. http://mattmahoney.net/dc/textdata.html, 2011. [Online; accessed May 2019].
Dataset Splits	No	The paper mentions using standard corpora and popular datasets for evaluation, but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) needed to reproduce the data partitioning.
Hardware Specification	No	The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments.
Software Dependencies	No	Evaluation on popular data sets [1, 25] uses the Gensim toolkit [32].
Experiment Setup	Yes	In summary, we learn 500 dimensional embeddings from word co-occurrences extracted from a standard corpus (text8 [24]). ... In summary, we learn 500-dimensional embeddings from word co-occurrences extracted from text8 using a window size of 5 (W2V parameter). For the LSQ models, a batch size of 512 was used, with 10 epochs (early stopping). For all models, the negative sampling parameter k=5.
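
The hyperparameters quoted in the Experiment Setup row can be collected in one place, as in the minimal sketch below. This is not the authors' code (the Open Source Code row notes that none was released). The names `ExperimentSetup` and `count_cooccurrences` are hypothetical, and the symmetric fixed-size window is an assumption: the quoted text gives only the window size, not how the window is applied.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentSetup:
    """Hyperparameters quoted in the Experiment Setup row (names are hypothetical)."""
    embedding_dim: int = 500    # "500-dimensional embeddings"
    window_size: int = 5        # "window size of 5 (W2V parameter)"
    negative_samples: int = 5   # "negative sampling parameter k=5", all models
    lsq_batch_size: int = 512   # LSQ models only
    lsq_max_epochs: int = 10    # LSQ models only, with early stopping


def count_cooccurrences(tokens, window_size):
    """Count (word, context) pairs inside a symmetric window.

    Assumption: a fixed symmetric window with no subsampling or
    dynamic window shrinking; the source does not specify these details.
    """
    counts = Counter()
    for i, word in enumerate(tokens):
        lo = max(0, i - window_size)
        hi = min(len(tokens), i + window_size + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                counts[(word, tokens[j])] += 1
    return counts


setup = ExperimentSetup()
toy_tokens = "the cat sat on the mat".split()  # toy stand-in for text8
counts = count_cooccurrences(toy_tokens, setup.window_size)
```

On the real corpus, `toy_tokens` would be replaced by the whitespace-split contents of the text8 file. The resulting counts are the kind of co-occurrence statistics from which the embeddings described above are learned.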