Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Training Temporal Word Embeddings with a Compass
Authors: Valerio Di Carlo, Federico Bianchi, Matteo Palmonari (pp. 6326-6334)
AAAI 2019 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments conducted using state-of-the-art datasets and methodologies suggest that our approach outperforms or equals comparable approaches while being more robust in terms of the required corpus size. |
| Researcher Affiliation | Collaboration | ¹BUP Solutions, Rome, Italy; ²University of Milan-Bicocca, Milan, Italy |
| Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | Our experiments can be easily replicated using the source code available online¹. ¹ https://github.com/valedica/twec |
| Open Datasets | Yes | The small dataset (Yao et al. 2018) is freely available online². We will refer to this dataset as News Article Corpus Small (NAC-S). The big dataset is the New York Times Annotated Corpus³ (Sandhaus 2008) employed by Szymanski; Zhang et al. to test their TWEMs. ² https://sites.google.com/site/zijunyaorutgers/publications ³ https://catalog.ldc.upenn.edu/ldc2008t19 |
| Dataset Splits | Yes | MLPC is made available online (Rudolph and Blei 2018) by Rudolph and Blei: the text is already preprocessed, sub-sampled (\|V\| = 5,000) and split into training, validation and testing (80%, 10%, 10%). |
| Hardware Specification | No | The paper mentions "DBE takes almost 6 hours to train on NAC-S on a 16-core CPU setting." but does not provide specific CPU models, GPU models, or other detailed hardware specifications. |
| Software Dependencies | No | The paper mentions using "gensim library" and "tensorflow" but does not specify any version numbers for these software dependencies, making it difficult to reproduce the exact software environment. |
| Experiment Setup | Yes | The hyper-parameters reflect those of Yao et al.: small embeddings of size 50, a window of 5 words, 5 negative samples and a small vocabulary of 21k words with at least 200 occurrences over the entire corpus. [...] The settings parameters are similar to those of Szymanski: longer embeddings of size 100, a window size of 5, 5 negative samples and a very large vocabulary of almost 200k words with at least 5 occurrences over the entire corpus. [...] Learning rate η = 0.0025, window of size 1, embeddings of size 50 and 10 iterations (5 static and 5 dynamic for TWEC, 1 static and 9 dynamic for DBE as suggested by Rudolph and Blei). |
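
The three hyper-parameter settings quoted in the Experiment Setup row can be written down as plain configuration objects, which makes them easier to compare at a glance. The following is a minimal sketch, not the authors' code: the class name `EmbeddingSetup`, its field names, and the dictionary keys are hypothetical labels introduced here for illustration. Values the row does not state (for example, negative samples in the third setting) are left as `None` rather than guessed, and the row itself does not say which corpus each setting was applied to.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EmbeddingSetup:
    """One hyper-parameter setting quoted in the Experiment Setup row.

    Field names are hypothetical; None means "not stated in the row".
    """
    embedding_size: int
    window: int
    negative_samples: Optional[int] = None
    min_occurrences: Optional[int] = None
    learning_rate: Optional[float] = None
    static_iterations: Optional[int] = None
    dynamic_iterations: Optional[int] = None


# Keys are illustrative labels only, not names used by the paper.
SETUPS = {
    # "reflect those of Yao et al."
    "yao_style": EmbeddingSetup(
        embedding_size=50, window=5, negative_samples=5, min_occurrences=200
    ),
    # "similar to those of Szymanski"
    "szymanski_style": EmbeddingSetup(
        embedding_size=100, window=5, negative_samples=5, min_occurrences=5
    ),
    # Third quoted setting, TWEC variant: 5 static + 5 dynamic iterations.
    "third_setting_twec": EmbeddingSetup(
        embedding_size=50, window=1, learning_rate=0.0025,
        static_iterations=5, dynamic_iterations=5,
    ),
    # Third quoted setting, DBE variant: 1 static + 9 dynamic iterations.
    "third_setting_dbe": EmbeddingSetup(
        embedding_size=50, window=1, learning_rate=0.0025,
        static_iterations=1, dynamic_iterations=9,
    ),
}


def total_iterations(setup: EmbeddingSetup) -> Optional[int]:
    """Return static + dynamic iterations, or None if either is unstated."""
    if setup.static_iterations is None or setup.dynamic_iterations is None:
        return None
    return setup.static_iterations + setup.dynamic_iterations
```

Both variants of the third setting sum to the 10 iterations the row reports, which `total_iterations` can confirm. The first two settings return `None` because the row gives no iteration counts for them.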