Abductive Learning with Ground Knowledge Base

Authors: Le-Wen Cai, Wang-Zhou Dai, Yu-Xuan Huang, Yu-Feng Li, Stephen Muggleton, Yuan Jiang

IJCAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | This section discusses why GABL can improve machine learning model performance by leveraging unlabeled data and a ground knowledge base. First, we illustrate the mechanism of GABL through an intuitive example. Second, we construct experiments that aim to address: 1) how the accuracy of the machine learning model affects abductive learning given the domain knowledge; 2) how the domain knowledge affects abductive learning. This section describes an experiment that applies GABL to a handwritten Optical Character Recognition (OCR) task.
Researcher Affiliation | Academia | (1) National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; (2) Department of Computing, Imperial College London, London SW7 2AZ, UK
Pseudocode | Yes | Algorithm 1: Grounded Abductive Learning
Open Source Code | Yes | The code is available for download: https://github.com/AbductiveLearning/GABL
Open Datasets | Yes | When basic data are MNIST images [LeCun et al., 1995], CIFAR-10 images [Krizhevsky, 2009]... We use IAM-database [Marti and Bunke, 2002] as the test benchmark.
Dataset Splits | No | The paper states, "We leave 10% of the data for testing and randomly pick out different number data as labeled data." It specifies a test split but does not explicitly mention a validation split or its percentage/methodology. (A hedged sketch of this split procedure appears after the table.)
Hardware Specification | Yes | The experiments are run on a single V100S GPU.
Software Dependencies | No | The paper mentions using CRNN as the basic machine learning model, but does not provide specific version numbers for any software dependencies like Python, PyTorch, or other libraries.
Experiment Setup | Yes | We use CRNN [Shi et al., 2017] as the basic machine learning model. During prediction, the CRNN greedily selects the highest-probability letter at each position and then merges the repeating letters. Finally, we pick the abduced pseudo-labels ranked by the CTC loss [Graves et al., 2006]. Additionally, we set k = 3 in KNN and require each leaf node of the decision tree to contain at least three samples during training. (A hedged sketch of the greedy decoding step appears after the table.)
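
The split described under Dataset Splits (10% held out for testing, with a randomly drawn labeled subset of varying size) can be illustrated as follows. This is a minimal sketch under our own assumptions; the function name, seeding, and index handling are hypothetical and not taken from the released GABL code.

```python
# Hypothetical sketch of the reported split: hold out 10% for testing,
# then draw a labeled subset of a chosen size from the remaining data.
import numpy as np

def split_data(n_samples, n_labeled, test_ratio=0.1, seed=0):
    """Return index arrays (labeled, unlabeled, test)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_test = int(test_ratio * n_samples)
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    labeled_idx = train_idx[:n_labeled]      # randomly picked labeled data
    unlabeled_idx = train_idx[n_labeled:]    # remaining training data, used without labels
    return labeled_idx, unlabeled_idx, test_idx
```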
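
The prediction step quoted under Experiment Setup (greedy selection of the highest-probability letter at each position, followed by merging of repeated letters) corresponds to standard greedy CTC decoding. The sketch below is an assumed illustration, not the authors' implementation; the blank index and alphabet are placeholders.

```python
# Hypothetical greedy CTC decoding: take the argmax symbol per frame,
# collapse consecutive repeats, and drop the blank symbol.
import numpy as np

BLANK = 0                                   # assumed index of the CTC blank
ALPHABET = "abcdefghijklmnopqrstuvwxyz"     # assumed label set (indices 1..26)

def greedy_ctc_decode(frame_scores):
    """frame_scores: (T, C) per-frame class scores from the CRNN; returns a string."""
    best = frame_scores.argmax(axis=1)      # highest-probability symbol at each position
    decoded, prev = [], None
    for idx in best:
        if idx != prev and idx != BLANK:    # merge repeating letters, skip blanks
            decoded.append(ALPHABET[idx - 1])
        prev = idx
    return "".join(decoded)
```

For example, per-frame argmax indices [3, 3, 0, 3, 1, 1] decode to "cca": the repeated 3s collapse to one "c", the blank separates the next "c", and the repeated 1s collapse to a single "a".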