Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Scallop: From Probabilistic Deductive Databases to Scalable Differentiable Reasoning
Authors: Jiani Huang, Ziyang Li, Binghong Chen, Karan Samel, Mayur Naik, Le Song, Xujie Si
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | On synthetic tasks involving mathematical and logical reasoning, Scallop scales significantly better without sacrificing accuracy compared to DeepProbLog, a principled neural logic programming approach. Scallop also scales to a newly created real-world Visual Question Answering (VQA) benchmark that requires multi-hop reasoning, achieving 84.22% accuracy and outperforming two VQA-tailored models based on Neural Module Networks and transformers by 12.42% and 21.66% respectively. |
| Researcher Affiliation | Academia | Jiani Huang University of Pennsylvania EMAIL Ziyang Li University of Pennsylvania EMAIL Binghong Chen Georgia Institute of Technology EMAIL Karan Samel Georgia Institute of Technology EMAIL Mayur Naik University of Pennsylvania EMAIL Le Song Georgia Institute of Technology EMAIL Xujie Si McGill University and CIFAR AI Chair, Mila EMAIL |
| Pseudocode | No | The paper describes algorithms and processes in narrative text and figures, but it does not include formally structured pseudocode or algorithm blocks with specific labels such as 'Algorithm' or 'Pseudocode'. |
| Open Source Code | Yes | The source code of Scallop is available at https://github.com/scallop-lang/scallop-v1. |
| Open Datasets | Yes | The images and scene graphs are from the GQA [18] dataset and the knowledge graph is from the CRIC [16] dataset. ... Each task takes as input multiple MNIST [20] images and requires performing simple arithmetic (T1-T3) or sorting (T4-T6) over digits depicted in the given images. |
| Dataset Splits | Yes | We split the images randomly into training (60%), validation (10%), and testing (30%) sets. |
| Hardware Specification | Yes | All experiments are conducted on a machine with two 20-core Intel Xeon CPUs, four GeForce RTX 2080 Ti GPUs, and 768 GB RAM. |
| Software Dependencies | No | The paper mentions software components such as Datalog, Prolog, Sentential Decision Diagram (SDD), Mask R-CNN, ResNet, and TransE, but it does not specify version numbers for these or any programming language libraries (e.g., Python, PyTorch, TensorFlow) used in the implementation. |
| Experiment Setup | Yes | Scallop takes 92 hours to finish 15 training epochs with k = 10 and takes only 0.3 seconds on average per training sample. ... In our experimental setup, we apply the binary cross entropy loss function on the two vectors. |
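The Dataset Splits row quotes a random 60%/10%/30% train/validation/test split. The paper does not publish its splitting code, so the following is only a minimal sketch of such a seeded random split; the function name `split_dataset` and the fixed seed are assumptions, not from the paper.

```python
import random


def split_dataset(items, train_frac=0.6, val_frac=0.1, seed=0):
    """Randomly partition items into train/val/test sets.

    The remainder after the train and validation fractions
    (here 30%) becomes the test set.
    """
    items = list(items)               # copy so the caller's order is untouched
    random.Random(seed).shuffle(items)  # seeded shuffle for reproducibility
    n = len(items)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return (
        items[:n_train],                      # training set
        items[n_train:n_train + n_val],       # validation set
        items[n_train + n_val:],              # testing set
    )


train_set, val_set, test_set = split_dataset(range(100))
```

With 100 items this yields 60/10/30 examples; a different seed gives a different but equally sized partition.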
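The Experiment Setup row quotes that binary cross entropy is applied to two vectors (a predicted probability vector and a target vector). As a rough illustration of that loss, here is a plain-Python sketch; the paper's actual implementation details (framework, reduction, clamping constant) are not specified, so the `eps` clamp and mean reduction are assumptions.

```python
import math


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross entropy between a predicted probability
    vector and a 0/1 target vector.

    Predictions are clamped to [eps, 1 - eps] so log() stays finite.
    """
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # numerical-stability clamp
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)
```

For example, `binary_cross_entropy([0.9, 0.1], [1.0, 0.0])` evaluates to about 0.105 (i.e. -ln 0.9), and the loss grows as predictions move away from the targets.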