What the Vec? Towards Probabilistically Grounded Embeddings
Authors: Carl Allen, Ivana Balazevic, Timothy Hospedales
NeurIPS 2019 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Here we draw on previous results and run test experiments to provide empirical support for our main theoretical results: ... Table 1: Accuracy in semantic tasks using different loss functions on the text8 corpus [24]. |
| Researcher Affiliation | Collaboration | 1 School of Informatics, University of Edinburgh, UK 2 Samsung AI Centre, Cambridge, UK |
| Pseudocode | No | No pseudocode or algorithm blocks were found in the paper. |
| Open Source Code | No | The paper provides no concrete access to source code for the described methodology: no repository link, no explicit code-release statement, and no code in the supplementary materials. |
| Open Datasets | Yes | We learn 500-dimensional embeddings from word co-occurrences extracted from a standard corpus (text8 [24]). ... [24] Matt Mahoney. text8 Wikipedia dump. http://mattmahoney.net/dc/textdata.html, 2011. [Online; accessed May 2019]. |
| Dataset Splits | No | The paper mentions using standard corpora and popular datasets for evaluation, but does not specify explicit training/validation/test splits (e.g., percentages or sample counts) needed to reproduce the data partitioning. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. |
| Software Dependencies | No | Evaluation on popular data sets [1, 25] uses the Gensim toolkit [32], but no version information for Gensim or any other software dependency is provided. |
| Experiment Setup | Yes | In summary, we learn 500-dimensional embeddings from word co-occurrences extracted from a standard corpus (text8 [24]) using a window size of 5 (W2V parameter). For the LSQ models, a batch size of 512 was used, with 10 epochs (early stopping). For all models, the negative sampling parameter k=5. |
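The experiment setup above corresponds to a standard word2vec configuration that can be reproduced with Gensim, the toolkit the paper itself uses for evaluation. The sketch below is a minimal illustration under stated assumptions: it trains only the standard skip-gram baseline (the paper's LSQ loss variants would require custom training code), and `min_count`, `sg`, `workers`, `epochs`, and the output filename are assumptions not given in the paper.

```python
import gensim.downloader as api
from gensim.models import Word2Vec

# Load the text8 corpus via gensim-data (assumption: the paper cites
# Mahoney's dump at http://mattmahoney.net/dc/textdata.html directly).
corpus = api.load("text8")

# Hyperparameters quoted in the report: 500-dim embeddings, window size 5,
# negative sampling with k=5.
model = Word2Vec(
    sentences=corpus,
    vector_size=500,  # 500-dimensional embeddings
    window=5,         # W2V window size of 5
    negative=5,       # negative sampling parameter k=5
    sg=1,             # skip-gram (assumption: standard W2V setup)
    min_count=5,      # assumption: not specified in the paper
    workers=4,        # assumption: not specified in the paper
    epochs=5,         # gensim default; the report gives 10 epochs only for the LSQ models
)
model.save("w2v_text8_500d.model")  # hypothetical filename
```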
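Since the report notes that evaluation is done with Gensim, a hedged sketch of that step is shown below, scoring the trained vectors with the analogy and word-similarity test files bundled with Gensim; the paper's actual evaluation sets [1, 25] may differ.

```python
from gensim.models import Word2Vec
from gensim.test.utils import datapath

# Load the vectors trained in the sketch above (hypothetical filename).
wv = Word2Vec.load("w2v_text8_500d.model").wv

# Google analogy questions ship with gensim's test data; the paper's
# evaluation sets [1, 25] may differ (assumption).
accuracy, _sections = wv.evaluate_word_analogies(datapath("questions-words.txt"))
print(f"Analogy accuracy: {accuracy:.3f}")

# Word-pair similarity against the bundled WordSim-353 file.
pearson, spearman, oov = wv.evaluate_word_pairs(datapath("wordsim353.tsv"))
print(f"WordSim-353 Pearson r: {pearson[0]:.3f} (OOV: {oov:.1f}%)")
```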