Scalable Coupling of Deep Learning with Logical Reasoning

Authors: Marianne Defresne, Sophie Barbe, Thomas Schiex

IJCAI 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We empirically show it is able to efficiently learn how to solve NP-hard reasoning problems from natural inputs, such as the symbolic, visual, or many-solutions Sudoku problems, as well as the energy-optimization formulation of the protein design problem, providing data efficiency, interpretability, and a posteriori control over predictions. We test our architecture on logical (feasibility) problems with one or many solutions [Nandwani et al., 2021], the input ω being purely symbolic or containing images. We also apply it to a real, purely data-defined, discrete optimization problem to check the ability of the E-NPLL to estimate a criterion.
Researcher Affiliation | Academia | Marianne Defresne (1,2), Sophie Barbe (2) and Thomas Schiex (1). (1) Université Fédérale de Toulouse, ANITI, INRAE, UR 875, 31326 Toulouse, France. (2) TBI, Université de Toulouse, CNRS, INRAE, INSA, ANITI, 31077 Toulouse, France.
Pseudocode | No | The paper describes algorithmic steps in prose but does not include formal pseudocode or algorithm blocks.
Open Source Code | Yes | Our code is written in Python using PyTorch version 1.10.2 and PyToulbar2 version 0.0.0.2. Code and data are available at https://forgemia.inra.fr/marianne.defresne/emmental-pll. (See the solver sketch after the table.)
Open Datasets | Yes | We use an existing data set [Palm et al., 2018], composed of single-solution grids with 17 to 34 hints. Our data set is obtained from the symbolic Sudoku data set by replacing hints with corresponding MNIST images, as in [Brouard et al., 2020]. For training, we use the data set of [Ingraham et al., 2019], already split into train/validation/test sets of respectively 17,000, 600 and 1,200 proteins, in such a way that proteins with similar structures or sequences are in the same set.
Dataset Splits | Yes | We use 1,000 grids for training, and 256 for validation (all hardness). We use 1,000, 64 and 256 grids of the data set from [Nandwani et al., 2021] respectively for training, validating, and testing.
Hardware Specification | Yes | Unless specified otherwise, all experiments use an Nvidia RTX-6000 with 24 GB of VRAM and a 2.2 GHz CPU with 128 GB of RAM.
Software Dependencies | Yes | Our code is written in Python using PyTorch version 1.10.2 and PyToulbar2 version 0.0.0.2.
Experiment Setup | Yes | We use the Adam optimizer with a weight decay of 10^-4 and a learning rate of 10^-3 (other parameters take default values). An L1 regularization with multiplier 2·10^-4 is applied on the cost matrices N(ω)[i, j]. For N, we use a Multi-Layer Perceptron (MLP) with 10 hidden layers of 128 neurons and residual connections [He et al., 2016] every 2 layers.
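
To make the Experiment Setup row concrete, below is a minimal PyTorch sketch of the described training configuration. This is an illustration under assumptions, not the authors' code: the input/output dimensions, the ResidualMLP class name, and the pair_feats/target/loss_fn placeholders are hypothetical; only the optimizer settings, the L1 multiplier, and the 10-layer, 128-neuron residual MLP shape come from the paper.

```python
import torch
import torch.nn as nn

class ResidualMLP(nn.Module):
    """10 hidden layers of 128 neurons with a residual connection
    every 2 layers (He et al., 2016), per the Experiment Setup row."""
    def __init__(self, in_dim, out_dim, hidden=128, n_hidden=10):
        super().__init__()
        self.inp = nn.Linear(in_dim, hidden)
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.ReLU(),
                          nn.Linear(hidden, hidden), nn.ReLU())
            for _ in range(n_hidden // 2)  # one skip per pair of layers
        ])
        self.out = nn.Linear(hidden, out_dim)

    def forward(self, x):
        h = torch.relu(self.inp(x))
        for block in self.blocks:
            h = h + block(h)  # residual connection every 2 layers
        return self.out(h)

# Hypothetical dimensions: pair features in, a flattened 9x9 cost matrix out.
net = ResidualMLP(in_dim=64, out_dim=9 * 9)
opt = torch.optim.Adam(net.parameters(), lr=1e-3, weight_decay=1e-4)

def training_step(pair_feats, target, loss_fn):
    costs = net(pair_feats)
    # L1 regularization with multiplier 2e-4 on the predicted cost matrices.
    loss = loss_fn(costs, target) + 2e-4 * costs.abs().sum()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```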
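Similarly, for the Software Dependencies row: once cost matrices are predicted, the resulting cost function network is solved with toulbar2. The sketch below shows one plausible way to do that through PyToulbar2's CFN interface for the 81-cell Sudoku case. Treat it as an assumption-laden sketch: the costs tensor layout and the solve_sudoku helper are invented here, and the exact return format of Solve should be checked against the PyToulbar2 documentation.

```python
import itertools
import pytoulbar2

def solve_sudoku(costs, top=1000):
    """Hypothetical helper: costs[i][j] is the predicted 9x9 pairwise
    cost matrix for cells (i, j), given as nested Python lists."""
    cfn = pytoulbar2.CFN(top)  # top: upper bound above which costs are forbidden
    for i in range(81):
        cfn.AddVariable(f"cell{i}", list(range(1, 10)))  # digits 1..9
    for i, j in itertools.combinations(range(81), 2):
        # AddFunction takes the pairwise cost table flattened row-major.
        table = [costs[i][j][a][b] for a in range(9) for b in range(9)]
        cfn.AddFunction([i, j], table)
    res = cfn.Solve()  # assumed to return (solution, energy, ...) or None
    return res[0] if res else None
```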