Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning

Authors: Jiani Huang, Ziyang Li, Binghong Chen, Karan Samel, Mayur Naik, Le Song, Xujie Si

NeurIPS 2021

Reproducibility assessment: each variable below is followed by its result and the supporting LLM response.

Research Type: Experimental
LLM Response: On synthetic tasks involving mathematical and logical reasoning, Scallop scales significantly better than DeepProbLog, a principled neural logic programming approach, without sacrificing accuracy. Scallop also scales to a newly created real-world Visual Question Answering (VQA) benchmark that requires multi-hop reasoning, achieving 84.22% accuracy and outperforming two VQA-tailored models based on Neural Module Networks and transformers by 12.42% and 21.66%, respectively.

Researcher Affiliation: Academia
LLM Response: Jiani Huang (University of Pennsylvania, jianih@seas.upenn.edu); Ziyang Li (University of Pennsylvania, liby99@seas.upenn.edu); Binghong Chen (Georgia Institute of Technology, binghong@gatech.edu); Karan Samel (Georgia Institute of Technology, ksamel@gatech.edu); Mayur Naik (University of Pennsylvania, mhnaik@seas.upenn.edu); Le Song (Georgia Institute of Technology, lsong@cc.gatech.edu); Xujie Si (McGill University and CIFAR AI Chair, Mila, xsi@cs.mcgill.ca)

Pseudocode: No
LLM Response: The paper describes algorithms and processes in narrative text and figures, but it does not include formally structured pseudocode or algorithm blocks with specific labels such as 'Algorithm' or 'Pseudocode'.

Open Source Code: Yes
LLM Response: The source code of Scallop is available at https://github.com/scallop-lang/scallop-v1.

Open Datasets: Yes
LLM Response: The images and scene graphs are from the GQA [18] dataset and the knowledge graph is from the CRIC [16] dataset. ... Each task takes as input multiple MNIST [20] images and requires performing simple arithmetic (T1-T3) or sorting (T4-T6) over digits depicted in the given images. (An illustrative task-construction sketch appears after this assessment.)

Dataset Splits: Yes
LLM Response: We split the images randomly into training (60%), validation (10%), and testing (30%) sets. (A split sketch appears after this assessment.)

Hardware Specification: Yes
LLM Response: All experiments are conducted on a machine with two 20-core Intel Xeon CPUs, four GeForce RTX 2080 Ti GPUs, and 768 GB RAM.

Software Dependencies: No
LLM Response: The paper mentions software components such as Datalog, Prolog, Sentential Decision Diagram (SDD), Mask R-CNN, ResNet, and TransE, but it does not specify version numbers for these or for any programming-language libraries (e.g., Python, PyTorch, TensorFlow) used in the implementation.

Experiment Setup: Yes
LLM Response: Scallop takes 92 hours to finish 15 training epochs with k = 10 and takes only 0.3 seconds on average per training sample. ... In our experimental setup, we apply the binary cross entropy loss function on the two vectors. (A loss sketch appears at the end of this section.)
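
To make the MNIST-based benchmark concrete, below is a minimal sketch of how a T1-style arithmetic task could be constructed: two MNIST images are paired, and the pair is labeled with the sum of the depicted digits. The class name MNISTSumPairs and the consecutive-pairing scheme are illustrative assumptions, not the paper's exact construction.

    from torch.utils.data import Dataset
    from torchvision import datasets, transforms

    class MNISTSumPairs(Dataset):
        """Illustrative T1-style task: two MNIST images, label = digit sum (0..18).

        Pairing consecutive images is an assumption made for this sketch; the
        paper does not specify how image tuples are sampled.
        """

        def __init__(self, root="./data", train=True):
            self.mnist = datasets.MNIST(root, train=train, download=True,
                                        transform=transforms.ToTensor())

        def __len__(self):
            return len(self.mnist) // 2

        def __getitem__(self, idx):
            img_a, digit_a = self.mnist[2 * idx]
            img_b, digit_b = self.mnist[2 * idx + 1]
            return (img_a, img_b), digit_a + digit_b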
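
The quoted 60/10/30 split can be reproduced generically with torch.utils.data.random_split. The fixed seed below is an added assumption to make the split deterministic; the paper does not state one.

    import torch
    from torch.utils.data import random_split

    dataset = MNISTSumPairs(train=True)  # any torch Dataset works here

    n = len(dataset)
    n_train = int(0.6 * n)        # training (60%)
    n_val = int(0.1 * n)          # validation (10%)
    n_test = n - n_train - n_val  # testing (the remaining ~30%)

    train_set, val_set, test_set = random_split(
        dataset, [n_train, n_val, n_test],
        generator=torch.Generator().manual_seed(0),  # assumed seed for reproducibility
    )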
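
The experiment setup applies binary cross entropy between two vectors, presumably the predicted output-probability vector and the ground-truth vector. A minimal PyTorch sketch under that assumption (the concrete values are hypothetical):

    import torch
    import torch.nn.functional as F

    # Hypothetical values: `pred` stands in for the probability vector produced
    # by the differentiable reasoning engine; `target` is the one-hot ground truth.
    pred = torch.tensor([0.1, 0.7, 0.2], requires_grad=True)  # entries in [0, 1]
    target = torch.tensor([0.0, 1.0, 0.0])

    loss = F.binary_cross_entropy(pred, target)  # mean element-wise BCE
    loss.backward()  # gradients flow back through the probability estimates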