Continual Multimodal Knowledge Graph Construction

Authors: Xiang Chen, Jingtian Zhang, Xiaohan Wang, Ningyu Zhang, Tongtong Wu, Yuxiang Wang, Yongheng Wang, Huajun Chen

IJCAI 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This study introduces benchmarks aimed at fostering the development of the continual MKGC domain. We further introduce the MSPT framework, designed to surmount the shortcomings of existing MKGC approaches during multimedia data processing. MSPT harmonizes the retention of learned knowledge (stability) and the integration of new data (plasticity), outperforming current continual learning and multimodal methods. Our results confirm MSPT's superior performance in evolving knowledge environments, showcasing its capacity to navigate the balance between stability and plasticity. ... We benchmark MSPT against the Vanilla Training approach, multimodal KGC models such as MEGA and MKGformer, as well as the continual RE method RP-CRE.
Researcher Affiliation | Collaboration | Xiang Chen (1,3), Jingtian Zhang (2,3), Xiaohan Wang (2,3), Ningyu Zhang (2,3), Tongtong Wu (4), Yuxiang Wang (5), Yongheng Wang (6), and Huajun Chen (1,3). Affiliations: (1) College of Computer Science and Technology, Zhejiang University; (2) School of Software Technology, Zhejiang University; (3) Zhejiang University - Ant Group Joint Research Center for Knowledge Graphs; (4) Monash University; (5) Hangzhou Dianzi University; (6) Zhejiang Lab
Pseudocode | No | The paper describes the methodology using text and diagrams, but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | Our data and code are available at https://github.com/zjunlp/ContinueMKGC
Open Datasets | Yes | Our data and code are available at https://github.com/zjunlp/ContinueMKGC. ... The k-th task T_k includes a distinct set of entity types E_k and relations R_k, along with an MKGC corpus C_k, which is divided into training, validation, and testing subsets D_k, V_k, and Q_k, respectively.
Dataset Splits | Yes | The k-th task T_k includes a distinct set of entity types E_k and relations R_k, along with an MKGC corpus C_k, which is divided into training, validation, and testing subsets D_k, V_k, and Q_k, respectively. (A schematic sketch of this task structure is given after the table.)
Hardware Specification | No | The paper does not provide specific details about the hardware used for experiments (e.g., CPU, GPU models, memory).
Software Dependencies | No | The paper mentions using BERT and ViT models and the SGD algorithm, but does not specify software versions (e.g., Python, PyTorch, or TensorFlow versions).
Experiment Setup | Yes | As illustrated in Figure 2, our continual KGC framework adopts a dual-stream Transformer structure with the task-specific paradigm, including: (1) Structure. We incorporate a Vision Transformer (ViT) [Dosovitskiy et al., 2021] for visual data and BERT for textual data. ... For the MRE task, we employ a task-specific approach by fusing the [CLS] token representations... In the context of MNER, for fair benchmarking against prior work, we employ a CRF function... We propose a gradient modulation strategy to fine-tune the optimization of the visual and textual encoders, depicted in Figure 2(b). Building upon these concepts, we adapt them to the k-th task using the Stochastic Gradient Descent (SGD) algorithm... This is based on quantifying their respective contributions to the learning goal via the contribution ratio γ_n. ... We further propose to balance the multimodal learning rhythm by integrating the coefficient g^t_n into the SGD optimization process of task k in iteration n. ... This setup permits the development of new attention patterns during the k-th task without penalties, while attention absent in the current but present in the (k-1)-th task is penalized, promoting targeted knowledge retention. (4.4 Training Objective) Our model leverages a cross-entropy loss (L_CE) to effectively recognize entities and relations, while an attention distillation loss (L_AD) mitigates the issue of catastrophic forgetting. We formulate the combined loss function as L_all = λ · L_AD + L_CE. ... Additionally, we adopt the rehearsal strategy from RP-CRE to retain a concise memory set of merely six examples per task for continual learning alignment while optimizing the memory footprint. (A code sketch of the combined objective and gradient modulation is given below the table.)
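For the Dataset Splits excerpt, the following is a minimal, illustrative Python sketch of how the k-th task T_k and its D_k / V_k / Q_k splits could be represented in memory. The class and field names (MKGCTask, train, valid, test) are assumptions for illustration and do not come from the paper or the released code.

```python
# Illustrative container for the k-th continual MKGC task T_k: its entity
# types E_k, relations R_k, and the corpus C_k split into training (D_k),
# validation (V_k), and test (Q_k) subsets. Names are assumptions.
from dataclasses import dataclass, field
from typing import Any, Dict, List, Set


@dataclass
class MKGCTask:
    task_id: int                                                  # k
    entity_types: Set[str] = field(default_factory=set)           # E_k
    relations: Set[str] = field(default_factory=set)              # R_k
    train: List[Dict[str, Any]] = field(default_factory=list)     # D_k
    valid: List[Dict[str, Any]] = field(default_factory=list)     # V_k
    test: List[Dict[str, Any]] = field(default_factory=list)      # Q_k
```

In a continual setting, tasks T_1, ..., T_K would be presented sequentially, with each task contributing its own entity types and relations.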
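The Experiment Setup excerpt combines a cross-entropy objective with an attention distillation term and a modality-wise gradient modulation coefficient inside SGD. The sketch below shows one plausible reading of L_all = λ · L_AD + L_CE and of scaling each modality's gradients by its coefficient g^t_n; the exact definitions of L_AD, γ_n, and g^t_n in the paper may differ, and all function and variable names here are illustrative rather than taken from the released implementation.

```python
# Minimal PyTorch-style sketch of the combined objective and the
# modality-wise gradient modulation described in the excerpt above.
# lambda_ad, g_visual, g_textual, and the exact form of L_AD are
# placeholders, not values or formulas taken from the paper.
import torch
import torch.nn.functional as F


def attention_distillation_loss(attn_cur, attn_prev):
    """Penalize attention mass that was present in the (k-1)-th task's
    attention maps but is missing in the current task, while leaving
    newly emerging attention patterns unpenalized (one plausible reading)."""
    deficit = torch.clamp(attn_prev - attn_cur, min=0.0)
    return deficit.mean()


def combined_loss(logits, labels, attn_cur, attn_prev, lambda_ad=0.5):
    """L_all = lambda * L_AD + L_CE (lambda_ad here is a placeholder value)."""
    l_ce = F.cross_entropy(logits, labels)
    l_ad = attention_distillation_loss(attn_cur, attn_prev)
    return lambda_ad * l_ad + l_ce


def modulated_sgd_step(visual_params, textual_params, g_visual, g_textual, lr=1e-3):
    """Scale each modality's gradients by its modulation coefficient
    (standing in for g^t_n) before the SGD update, so that the dominant
    modality is slowed down relative to the other."""
    with torch.no_grad():
        for p in visual_params:
            if p.grad is not None:
                p.add_(p.grad, alpha=-lr * g_visual)
        for p in textual_params:
            if p.grad is not None:
                p.add_(p.grad, alpha=-lr * g_textual)
```

In practice, the previous-task attention maps (attn_prev) would be cached from the checkpoint trained on task k-1, and the modulation coefficients would be derived each iteration from the per-modality contribution ratios γ_n, matching the retention and balancing behaviour the excerpt describes.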