Visual Relation Detection using Hybrid Analogical Learning
Authors: Kezhen Chen, Ken Forbus (pp. 801-808)
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments on the Visual Relation Detection dataset indicates that our hybrid system gets comparable results on the task and is more training-efficient and explainable than pure deep-learning models. |
| Researcher Affiliation | Academia | Kezhen Chen and Ken Forbus Northwestern University, Evanston, IL kzchen@u.northwestern.edu, forbus@northwestern.edu |
| Pseudocode | No | The paper describes the system architecture and processes verbally and with a diagram (Figure 1), but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper provides links to third-party tools (CogSketch, NextKB, SME) used in their methodology, but does not provide access to the source code for their specific hybrid system. |
| Open Datasets | Yes | We evaluate our hybrid system on the Visual Relationship Dataset (VRD) (Lu et al., 2016). |
| Dataset Splits | No | We follow the popular train/test split, using 4,000 images for training and the other 1,000 images for testing. |
| Hardware Specification | No | Also, analogical learning does not require to use expensive hardware resources such as GPUs, but all deep learning baselines need to use GPUs to speed up the training process. |
| Software Dependencies | No | We use Faster-RCNN (Ren et al., 2015) with VGG16 backbone to detect the object bounding boxes and categories. To detect object masks, we utilize Mask-RCNN model (He et al., 2018) with Resnet-50 as the backbone for instance segmentation. ... We use the pre-trained model from https://github.com/facebookresearch/detectron2. ... Our system uses a two-stage pipeline of object detection followed by pairwise relation detection. ... We use the off-the-shelf CogSketch system (Forbus et al. 2011) to help compute spatial information and relational predicates. CogSketch uses NextKB, an off-the-shelf open-source broad coverage knowledge base. ... Analogical matching is handled by the structure mapping engine (SME) (Forbus et al., 2017) for analogical matching, analogical retrieval by MAC/FAC (Forbus et al., 1995), and generalization is performed by the Sequential Analogical Generalization Engine (SAGE) (McLure et al., 2015). |
| Experiment Setup | Yes | In SAGE, we use 0.8 for the assimilation threshold and 0.2 for the cutoff threshold (which eliminates low-probability facts from a generalization). ... If the IoU is larger than 0.7, the corresponding instance segments from Mask-RCNN are assigned to the object. |
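
The IoU rule quoted in the Experiment Setup row can be illustrated with a minimal sketch. It is not the authors' code, which is not released. It assumes that boxes are axis-aligned `(x1, y1, x2, y2)` tuples, that IoU is computed between a detected object box and the box of each Mask-RCNN instance, and that when several instances pass the 0.7 threshold the one with the highest IoU is kept. The names `iou` and `assign_masks` are hypothetical.

```python
from typing import List, Optional, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def assign_masks(
    object_boxes: List[Box],
    mask_boxes: List[Box],
    threshold: float = 0.7,  # value reported in the paper
) -> List[Optional[int]]:
    """For each object box, return the index of the mask instance whose
    box overlaps it with IoU strictly above the threshold, or None.

    Keeping the highest-IoU candidate is an assumption of this sketch.
    """
    assignments: List[Optional[int]] = []
    for obj in object_boxes:
        best_idx: Optional[int] = None
        best_score = threshold
        for j, mask_box in enumerate(mask_boxes):
            score = iou(obj, mask_box)
            if score > best_score:
                best_idx, best_score = j, score
        assignments.append(best_idx)
    return assignments


if __name__ == "__main__":
    objects = [(0.0, 0.0, 10.0, 10.0), (50.0, 50.0, 60.0, 60.0)]
    masks = [(5.0, 0.0, 15.0, 10.0), (0.0, 0.0, 10.0, 9.0)]
    print(assign_masks(objects, masks))
```

In this usage example the first object overlaps the second mask box closely enough to be matched, while the second object overlaps no mask box and is left unassigned.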