Gaussian Embedding of Linked Documents from a Pretrained Semantic Space
Authors: Antoine Gourru, Julien Velcin, Julien Jacques
IJCAI 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We show that our representations outperform or match most of the recent methods in classification and link prediction on three datasets (two citation networks and a corpus of news articles) in Section 4. |
| Researcher Affiliation | Academia | Antoine Gourru¹, Julien Velcin¹ and Julien Jacques¹; ¹Université de Lyon, Lyon 2, ERIC UR3083 {antoine.gourru, julien.velcin, julien.jacques}@univ-lyon2.fr |
| Pseudocode | Yes | Algorithm 1: GELD Algorithm. Input: D, U; Parameters: η, λ, k; Output: µ, σ² (an illustrative sketch follows the table). |
| Open Source Code | Yes | We provide the implementation of GELD and the evaluation datasets to the community (https://github.com/AntoineGourru/DNEmbedding). |
| Open Datasets | Yes | Cora [Tu et al., 2017] and Dblp [Tang et al., 2008; Pan et al., 2016] are two citation networks. Additionally, we use the Nyt dataset from [Gourru et al., 2020] containing press articles from January 2007. |
| Dataset Splits | Yes | Train/Test ratios of 10% and 50% are reported for each of Cora, Dblp and Nyt (a split sketch follows the table). |
| Hardware Specification | Yes | We run all the experiments in parallel with 20 physical cores (Intel® Xeon® CPU E5-2640 v4 @ 2.40GHz) and 96GB of RAM. |
| Software Dependencies | No | The paper mentions 'scikit-learn package' and 'gensim' but does not specify their version numbers. It only states 'implemented in gensim'. |
| Experiment Setup | Yes | Similarly, we report the optimal parameters for GELD obtained via grid-search on the classification task: δ = 0.1, γ = 0.2, η = 0.99 for Cora, η = 0.8 for Dblp and η = 0.95 for Nyt. To learn word vectors, we adopt Skip-gram with negative sampling [Mikolov et al., 2013] implemented in gensim. We use window size of 15 for Cora, 10 for Nyt, 5 for DBLP (depending on documents size), and 5 negative examples for both. (A gensim sketch follows the table.) |
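
The Pseudocode row only excerpts the signature of Algorithm 1 (inputs D and U, parameters η, λ, k, outputs µ and σ²), so the actual GELD update rules are not reproduced in this report. The sketch below is therefore not the authors' algorithm: it only illustrates the general idea of embedding each document as a Gaussian (mean µ, variance σ²) in a pretrained word-vector space, assuming D is a list of tokenized documents and U a mapping from words to pretrained vectors; the network-based linkage term and the η/λ/k parameters of the real method are omitted.

```python
import numpy as np

def gaussian_doc_embeddings(docs, word_vectors):
    """Illustrative only: embed each document as a Gaussian (mu, sigma^2)
    in a pretrained word-vector space. This is NOT the GELD update rule;
    GELD additionally exploits document links and the eta/lambda/k parameters."""
    mus, sigma2s = [], []
    for tokens in docs:
        # Keep only tokens that have a pretrained vector (assumes at least one per document).
        vecs = np.array([word_vectors[t] for t in tokens if t in word_vectors])
        mu = vecs.mean(axis=0)                # document mean in the semantic space
        sigma2 = ((vecs - mu) ** 2).mean()    # scalar spread around that mean
        mus.append(mu)
        sigma2s.append(sigma2)
    return np.vstack(mus), np.array(sigma2s)

# Toy usage with a hypothetical 3-dimensional vocabulary.
toy_vectors = {"graph": np.ones(3), "text": np.zeros(3), "embedding": np.full(3, 0.5)}
mu, s2 = gaussian_doc_embeddings([["graph", "text"], ["embedding", "graph"]], toy_vectors)
```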
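For the Dataset Splits row, classification is evaluated at train/test ratios of 10% and 50% on Cora, Dblp and Nyt, but the report does not give split seeds or the exact classifier configuration. A minimal sketch of how such splits are commonly produced with scikit-learn (which the paper mentions), using synthetic placeholder data in place of the real document embeddings and labels:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder embeddings/labels standing in for Cora, Dblp or Nyt documents;
# seeds, dimensionality and classifier choice are assumptions, not the paper's values.
X, y = make_classification(n_samples=2000, n_features=160, n_classes=7,
                           n_informative=40, random_state=0)

for train_ratio in (0.10, 0.50):
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, train_size=train_ratio, stratify=y, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    print(f"train ratio {train_ratio:.0%}: test accuracy = {clf.score(X_test, y_test):.3f}")
```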
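The Experiment Setup row states that word vectors are learned with Skip-gram and negative sampling in gensim (window 15 for Cora, 10 for Nyt, 5 for Dblp; 5 negative samples), without pinning a gensim version. The sketch below assumes the gensim 4.x `Word2Vec` API; the corpus and the vector dimensionality are placeholders not specified in the excerpt.

```python
from gensim.models import Word2Vec

# corpus: list of tokenized documents; a toy placeholder is used here.
corpus = [["graph", "embedding", "of", "linked", "documents"],
          ["gaussian", "document", "representation"]]

window_per_dataset = {"cora": 15, "nyt": 10, "dblp": 5}  # window sizes reported in the paper

model = Word2Vec(
    sentences=corpus,
    vector_size=160,                      # placeholder dimension, not given in this excerpt
    sg=1,                                 # Skip-gram
    negative=5,                           # 5 negative samples, as reported
    window=window_per_dataset["cora"],
    min_count=1,
    workers=4,
    seed=0,
)
word_vectors = model.wv                   # pretrained semantic space later used by GELD
```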