PAC-Bayesian Generalization Bounds for Knowledge Graph Representation Learning

Authors: Jaejun Lee, Minsung Hwang, Joyce Jiyoung Whang

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically show that the critical factors in our generalization bounds can explain actual generalization errors on three real-world knowledge graphs. We conduct experiments on three real-world knowledge graphs: FB15K237 [47], CoDEx-M [40], and UMLS-43 [7; 27].
Researcher Affiliation | Academia | Jaejun Lee 1 Minsung Hwang 1 Joyce Jiyoung Whang 1 School of Computing, KAIST, Daejeon, South Korea. Correspondence to: Joyce Jiyoung Whang <jjwhang@kaist.ac.kr>.
Pseudocode | No | The paper describes the RAMP encoder and triplet classification decoder components mathematically and in prose, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | Our code and data are available at https://github.com/bdi-lab/ReED
Open Datasets | Yes | We conduct experiments on three real-world knowledge graphs: FB15K237 [47], CoDEx-M [40], and UMLS-43 [7; 27].
Dataset Splits | No | On all datasets, we create Ê randomly sampled from E without replacement with the sampling probability of 0.8.
Hardware Specification | Yes | We run all our experiments using NVIDIA GeForce RTX 2080 Ti.
Software Dependencies | Yes | When implementing ReED, we used Python 3.8 and PyTorch 1.12.1 with cudatoolkit 11.3.
Experiment Setup | Yes | In the RAMP encoder, we use ρ = ψ = identity and ϕ = LeakyReLU. We use d1 = 96 for FB15K237, d1 = 64 for CoDEx-M, and d1 = 48 for UMLS-43. For RAMP+TD, we set the learning rate to 0.0003 for FB15K237, 0.0005 for CoDEx-M, and 0.0002 for UMLS-43. For RAMP+SM, we set the learning rate to 0.0005 for all datasets. We set the margin of the margin loss to 0.5 for FB15K237 and CoDEx-M, and 0.75 for UMLS-43, and run all models for 2,000 epochs.
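The quoted split procedure (sampling a subset Ê from the triplet set E without replacement with probability 0.8) can be sketched as follows. This is a minimal illustration only: the function name, seed handling, and triplet representation are assumptions, not taken from the released ReED code.

```python
import random

def sample_edges(edges, p=0.8, seed=0):
    """Uniformly sample a fraction p of the triplets without replacement.

    Illustrative sketch of the paper's construction of the observed
    subset E-hat from E with sampling probability 0.8.
    """
    rng = random.Random(seed)
    k = round(p * len(edges))          # number of triplets to keep
    return rng.sample(edges, k)        # sampling without replacement

# Toy knowledge graph: 100 (head, relation, tail) triplets.
edges = [(f"h{i}", "r", f"t{i}") for i in range(100)]
subset = sample_edges(edges, p=0.8)
print(len(subset))  # 80
```

With `rng.sample`, each triplet appears at most once in the subset, matching "without replacement" in the quote.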
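The reported hyperparameters can be collected into a single per-dataset configuration table. The dictionary layout and key names below are illustrative assumptions; only the numeric values come from the quoted setup.

```python
# Per-dataset hyperparameters as reported in the Experiment Setup row.
# Keys (d1, lr_td, lr_sm, margin) are hypothetical names, not from the
# released ReED code: d1 is the embedding dimension, lr_td / lr_sm are
# the RAMP+TD and RAMP+SM learning rates, margin is the margin-loss margin.
CONFIG = {
    "FB15K237": {"d1": 96, "lr_td": 0.0003, "lr_sm": 0.0005, "margin": 0.5},
    "CoDEx-M":  {"d1": 64, "lr_td": 0.0005, "lr_sm": 0.0005, "margin": 0.5},
    "UMLS-43":  {"d1": 48, "lr_td": 0.0002, "lr_sm": 0.0005, "margin": 0.75},
}
EPOCHS = 2000  # all models are trained for 2,000 epochs

for name, cfg in CONFIG.items():
    print(name, cfg["d1"], cfg["lr_td"], cfg["margin"])
```

Note that only RAMP+TD uses dataset-specific learning rates; RAMP+SM uses 0.0005 everywhere.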