Nested Subspace Arrangement for Representation of Relational Data

Authors: Nozomi Hata, Shizuo Kaji, Akihiro Yoshida, Katsuki Fujisawa

ICML 2020

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Numerical experiments have shown that DANCAR has successfully embedded WordNet in R^20 with an F1 score of 0.993 in the reconstruction task. DANCAR is also suitable for visualization in understanding the characteristics of graphs.
Researcher Affiliation | Academia | (1) Graduate School of Mathematics, Kyushu University, Fukuoka, Japan; (2) Institute of Mathematics for Industry, Kyushu University, Fukuoka, Japan.
Pseudocode | Yes | Algorithm 1: Tree embedding.
Open Source Code | Yes | Source code is publicly available at https://github.com/KyushuUniversityMathematics/DANCAR.
Open Datasets | Yes | As a practical target graph, we used the largest weakly connected component of the noun closure of WordNet (Miller, 1995). A sketch of assembling this graph appears after the table.
Dataset Splits | Yes | For the reconstruction experiment, we compared the edge existence in the original graph and in the graph reconstructed from the embedding. For the link prediction experiment, we computed the embedding using half of the edges, chosen at random from the original graph, and then compared the edge existence in the original graph and in the reconstructed graph. Table 2 of the paper reports the precision and the F1 score of the reconstruction (100% training) and link prediction (50% training) tasks. An evaluation sketch follows the table.
Hardware Specification | No | The paper does not specify the hardware (e.g., GPU, CPU model, memory) used to run the experiments; it only mentions using R^10 and R^20 as the embedding spaces.
Software Dependencies | Yes | All experiments were implemented in Chainer 7.4.0.
Experiment Setup | Yes | The hyper-parameters for DANCAR were chosen as follows. The margin parameter µ was fixed to 0.01. We tested with R^10 and R^20 as the embedding space. We experimented with the hyper-parameters 8 ≤ λ_neg ≤ 1000 and λ_anc ∈ {1, 10}, and the best results were chosen. For optimization with stochastic gradient descent, we used two different batch sizes, b1 = 10,000 for the positive loss and the vertex loss and b2 = 100,000 for the negative loss, to account for the sparsity of the graph. We randomly selected the negative samples at each iteration. We used the Adam (Kingma & Ba, 2015) optimizer with parameters α = 0.05, β1 = 0.9, and β2 = 0.999. A training-setup sketch in Chainer follows the table.
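
The dataset quoted in the Open Datasets row can be assembled from public tools. The following is a minimal sketch, assuming NLTK (with the WordNet corpus downloaded) and networkx, and assuming that "noun closure" means the transitive closure of the direct hypernym relation with edges oriented child → parent; these choices are assumptions, not confirmed by the quoted text beyond "WordNet" and "largest weakly-connected component".

```python
# Sketch (not the authors' code): build the WordNet noun hypernym graph and
# take its largest weakly connected component.
# Assumptions: NLTK with the 'wordnet' corpus downloaded, networkx, edges
# oriented child -> parent, "noun closure" = transitive closure of hypernymy.
import networkx as nx
from nltk.corpus import wordnet as wn

hypernym_graph = nx.DiGraph()
for synset in wn.all_synsets(pos="n"):            # all noun synsets
    for hypernym in synset.hypernyms():           # direct hypernym edges
        hypernym_graph.add_edge(synset.name(), hypernym.name())

# Transitive closure adds the implied ancestor edges (assumed reading of
# "noun closure").
closure = nx.transitive_closure(hypernym_graph)

# Largest weakly connected component, as in the quoted passage.
largest = max(nx.weakly_connected_components(closure), key=len)
target_graph = closure.subgraph(largest).copy()
print(target_graph.number_of_nodes(), target_graph.number_of_edges())
```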
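
The split and scoring procedure quoted in the Dataset Splits row can be written down directly. This is a sketch of that protocol, not the authors' evaluation code; reconstruct_edges() is a hypothetical placeholder for the step that reads edges back out of a trained embedding.

```python
# Sketch of the evaluation protocol quoted in the "Dataset Splits" row
# (not the authors' code).
import random
import networkx as nx

def split_edges(graph, train_fraction=0.5, seed=0):
    """Keep a random fraction of the edges for training (link prediction)."""
    edges = list(graph.edges())
    random.Random(seed).shuffle(edges)
    kept = edges[: int(train_fraction * len(edges))]
    train_graph = nx.DiGraph()
    train_graph.add_nodes_from(graph.nodes())
    train_graph.add_edges_from(kept)
    return train_graph

def precision_and_f1(original_edges, reconstructed_edges):
    """Compare edge existence in the original and the reconstructed graph."""
    truth, predicted = set(original_edges), set(reconstructed_edges)
    tp = len(truth & predicted)
    precision = tp / max(len(predicted), 1)
    recall = tp / max(len(truth), 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, f1

# Reconstruction task: embed 100% of the edges, then score the graph read back
# from the embedding against the original graph.
# Link prediction task: embed the 50% split returned by split_edges(), then
# score the reconstructed graph against the full original graph:
#
#   train = split_edges(original_graph, train_fraction=0.5)
#   precision, f1 = precision_and_f1(original_graph.edges(),
#                                    reconstruct_edges(embedding))  # hypothetical
```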
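
The Experiment Setup row pins down the optimizer and hyper-parameters, and the Software Dependencies row names Chainer 7.4.0. The sketch below wires those values into a Chainer optimizer; BallEmbedding is a simplified stand-in for the DANCAR model, and the loss combination in the closing comment (weighting the negative loss by λ_neg and the anchor loss by λ_anc) is an assumption, not taken from the paper.

```python
# Sketch of the quoted training configuration, assuming Chainer 7.4.0 (see the
# Software Dependencies row). BallEmbedding and the loss combination described
# in the final comment are simplified assumptions, not the authors' code.
import chainer
import chainer.links as L
from chainer import optimizers

class BallEmbedding(chainer.Chain):
    """Toy stand-in: each vertex gets a center in R^dim and a radius."""
    def __init__(self, n_vertices, dim=20):
        super().__init__()
        with self.init_scope():
            self.center = L.EmbedID(n_vertices, dim)   # vertex centers in R^dim
            self.radius = L.EmbedID(n_vertices, 1)     # one radius per vertex

mu = 0.01                   # margin parameter
lambda_neg = 1000           # from the quoted range 8 <= lambda_neg <= 1000
lambda_anc = 10             # from the quoted set {1, 10}
b1, b2 = 10_000, 100_000    # batch sizes: positive/vertex losses vs. negative loss

model = BallEmbedding(n_vertices=1000, dim=20)   # illustrative graph size
optimizer = optimizers.Adam(alpha=0.05, beta1=0.9, beta2=0.999)
optimizer.setup(model)

# Each iteration would draw b1 positive/vertex samples and b2 freshly resampled
# negative samples, combine the losses (presumably as
#   loss = L_pos + L_vertex + lambda_neg * L_neg + lambda_anc * L_anchor),
# then run: model.cleargrads(); loss.backward(); optimizer.update()
```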