reproducibilityindex.ai

Gaussian Embedding of Linked Documents from a Pretrained Semantic Space

Authors: Antoine Gourru, Julien Velcin, Julien Jacques

IJCAI 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We show that our representations outperform or match most of the recent methods in classification and link prediction on three datasets (two citation networks and a corpus of news articles) in Section 4.
Researcher Affiliation	Academia	Antoine Gourru¹, Julien Velcin¹ and Julien Jacques¹; ¹Université de Lyon, Lyon 2, ERIC UR3083; {antoine.gourru, julien.velcin, julien.jacques}@univ-lyon2.fr
Pseudocode	Yes	Algorithm 1: GELD Algorithm. Input: D, U. Parameters: η, λ, k. Output: μ, σ²
Open Source Code	Yes	We provide the implementation of GELD and the evaluation datasets to the community (https://github.com/AntoineGourru/DNEmbedding).
Open Datasets	Yes	Cora [Tu et al., 2017] and Dblp [Tang et al., 2008; Pan et al., 2016] are two citation networks. Additionally, we use the Nyt dataset from [Gourru et al., 2020] containing press articles from January 2007.
Dataset Splits	Yes	Train/Test ratio: Cora 10%, 50%; Dblp 10%, 50%; Nyt 10%, 50%
Hardware Specification	Yes	We run all the experiments in parallel with 20 physical cores (Intel(R) Xeon(R) CPU E5-2640 v4 @ 2.40GHz) and 96GB of RAM.
Software Dependencies	No	The paper mentions 'scikit-learn package' and 'gensim' but does not specify their version numbers. It only states 'implemented in gensim'.
Experiment Setup	Yes	Similarly, we report the optimal parameters for GELD obtained via grid-search on the classification task: δ = 0.1, γ = 0.2, η = 0.99 for Cora, η = 0.8 for Dblp and η = 0.95 for Nyt. To learn word vectors, we adopt Skip-gram with negative sampling [Mikolov et al., 2013] implemented in gensim. We use window size of 15 for Cora, 10 for Nyt, 5 for DBLP (depending on documents size), and 5 negative examples for both.
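
The word-vector settings quoted in the Experiment Setup row could be passed to gensim roughly as follows. This is a minimal sketch, not the authors' code: the names WINDOW_SIZE, NEGATIVE_SAMPLES, skipgram_kwargs and train_word_vectors are hypothetical, and every setting the excerpt does not mention (embedding dimension, minimum word count, number of epochs) is left at the gensim default, which is an assumption.

```python
# Values quoted in the Experiment Setup row; the dict layout itself is illustrative.
WINDOW_SIZE = {"Cora": 15, "Nyt": 10, "Dblp": 5}
NEGATIVE_SAMPLES = 5


def skipgram_kwargs(dataset):
    """Return keyword arguments for gensim's Word2Vec for the given dataset."""
    return {
        "sg": 1,                         # 1 selects Skip-gram (0 would be CBOW)
        "negative": NEGATIVE_SAMPLES,    # number of negative samples per positive pair
        "window": WINDOW_SIZE[dataset],  # context window chosen per dataset
    }


def train_word_vectors(tokenized_docs, dataset):
    """Train Skip-gram vectors on a list of token lists (hypothetical helper)."""
    # Imported lazily so the configuration helper works without gensim installed.
    from gensim.models import Word2Vec

    # All other hyperparameters stay at gensim defaults (an assumption).
    return Word2Vec(sentences=tokenized_docs, **skipgram_kwargs(dataset))
```

Calling train_word_vectors(docs, "Cora") would train with a window of 15. Note that gensim's default minimum word count discards rare words, so a very small toy corpus may end up with an empty vocabulary unless that default is lowered.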