Visual Relation Detection using Hybrid Analogical Learning

Authors: Kezhen Chen, Ken Forbus

AAAI 2021

Reproducibility Variable | Result | LLM Response

Research Type | Experimental | Experiments on the Visual Relation Detection dataset indicate that our hybrid system gets comparable results on the task and is more training-efficient and explainable than pure deep-learning models.

Researcher Affiliation | Academia | Kezhen Chen and Ken Forbus, Northwestern University, Evanston, IL; kzchen@u.northwestern.edu, forbus@northwestern.edu

Pseudocode | No | The paper describes the system architecture and processes verbally and with a diagram (Figure 1), but does not include any pseudocode or algorithm blocks.

Open Source Code | No | The paper provides links to third-party tools (CogSketch, NextKB, SME) used in their methodology, but does not provide access to the source code for their specific hybrid system.

Open Datasets | Yes | We evaluate our hybrid system on the Visual Relationship Dataset (VRD) (Lu et al., 2016).

Dataset Splits | No | We follow the popular train/test split, using 4,000 images for training and the other 1,000 images for testing.

Hardware Specification | No | Also, analogical learning does not require expensive hardware resources such as GPUs, but all deep-learning baselines need GPUs to speed up the training process.

Software Dependencies | No | We use Faster-RCNN (Ren et al., 2015) with a VGG16 backbone to detect object bounding boxes and categories. To detect object masks, we utilize the Mask-RCNN model (He et al., 2018) with a ResNet-50 backbone for instance segmentation. ... We use the pre-trained model from https://github.com/facebookresearch/detectron2. ... Our system uses a two-stage pipeline of object detection followed by pairwise relation detection. ... We use the off-the-shelf CogSketch system (Forbus et al., 2011) to help compute spatial information and relational predicates. CogSketch uses NextKB, an off-the-shelf open-source broad-coverage knowledge base. ... Analogical matching is handled by the Structure-Mapping Engine (SME) (Forbus et al., 2017), analogical retrieval by MAC/FAC (Forbus et al., 1995), and generalization is performed by the Sequential Analogical Generalization Engine (SAGE) (McLure et al., 2015).

Experiment Setup | Yes | In SAGE, we use 0.8 for the assimilation threshold and 0.2 for the cutoff threshold (which eliminates low-probability facts from a generalization). ... If the IoU is larger than 0.7, the corresponding instance segments from Mask-RCNN are assigned to the object.
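The two-stage pipeline quoted under Software Dependencies (object detection followed by pairwise relation detection) can be sketched as below. This is a minimal illustration under our own assumptions, not the authors' code: `object_detector` and `relation_classifier` are hypothetical stand-ins for the Faster-RCNN detector and the analogical relation model.

```python
def detect_relations(image, object_detector, relation_classifier):
    """Two-stage sketch: detect objects, then classify each ordered pair.

    object_detector(image) -> list of (label, box) tuples.
    relation_classifier(image, subj, obj) -> predicate string or None.
    """
    objects = object_detector(image)
    relations = []
    for i, subj in enumerate(objects):
        for j, obj in enumerate(objects):
            if i == j:
                continue  # a relation needs two distinct objects
            predicate = relation_classifier(image, subj, obj)
            if predicate is not None:
                relations.append((subj[0], predicate, obj[0]))
    return relations
```

With stub components, `detect_relations(img, detector, classifier)` yields (subject, predicate, object) triples such as `("person", "on", "bike")`.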
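The IoU rule quoted under Experiment Setup (Mask-RCNN segments assigned to an object when IoU exceeds 0.7) can be illustrated with a short sketch; the function and variable names here are ours, not from the paper.

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

IOU_THRESHOLD = 0.7  # from the paper's experiment setup

def assign_segments(detected_boxes, mask_boxes):
    """Map each detected object to the best-overlapping mask box, if any
    exceeds the 0.7 IoU threshold. Returns {object index: mask index}."""
    assignments = {}
    for i, det in enumerate(detected_boxes):
        best_j, best_iou = None, IOU_THRESHOLD
        for j, mb in enumerate(mask_boxes):
            score = iou(det, mb)
            if score > best_iou:
                best_j, best_iou = j, score
        if best_j is not None:
            assignments[i] = best_j
    return assignments
```

Identical boxes give an IoU of 1.0 and disjoint boxes 0.0, so only near-coincident detector and mask boxes clear the 0.7 bar.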