On the Independence Assumption in Neurosymbolic Learning

Authors: Emile van Krieken, Pasquale Minervini, Edoardo Ponti, Antonio Vergari

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To visualise what the possible distributions found by minimising the semantic loss look like, we compare independent distributions and expressive distributions on the traffic light problem in Figure 3. We modelled independent distributions with two real-valued parameters and a sigmoid, and expressive distributions with four real-valued parameters and a softmax. Then, we minimise the semantic loss to ensure p(r, g) = 0. We use gradient descent for 10,000 iterations with a learning rate of 0.1. (A runnable sketch of this setup follows the table.)
Researcher Affiliation | Academia | School of Informatics, University of Edinburgh.
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide any explicit statement about releasing open-source code or a link to a code repository.
Open Datasets | Yes | Example 2.1 (Learning with algorithms). MNIST Addition is a popular benchmark task in neurosymbolic learning (Manhaeve et al., 2021).
Dataset Splits | No | The paper mentions using a 'labelled dataset' and an 'unlabelled dataset' but does not specify explicit train/validation/test splits, percentages, or sample counts.
Hardware Specification | No | The paper does not specify the hardware used for running the experiments or visualizations (e.g., CPU, GPU models, memory).
Software Dependencies | No | The paper mentions using 'gradient descent' and refers to neural networks, but it does not specify any software names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x).
Experiment Setup | Yes | We use gradient descent for 10,000 iterations with a learning rate of 0.1. ... we use the loss function Lα(θ) = (1 − α)L(θ) − αH(pθ|φ), where we compute H(pθ|φ) = 1/3 (pθ(¬r, g) + pθ(r, ¬g) + pθ(¬r, ¬g)). We plot the results in Figure 8 for the independent model and various values of α. (A sketch of this regularised loss also follows the table.)
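
The Research Type row quotes the paper's traffic-light experiment. Below is a minimal sketch of that setup, assuming PyTorch (the paper does not name its framework, and all identifiers here are illustrative, not the authors' code): an independent model with two logits and a sigmoid, an expressive model with four logits and a softmax over the joint states, both trained by gradient descent for 10,000 iterations at learning rate 0.1 to minimise the semantic loss for the constraint φ = ¬(r ∧ g), i.e. to drive pθ(r, g) to zero.

```python
import torch

# Semantic loss for the constraint φ = ¬(r ∧ g): since p(φ) = 1 − p(r, g),
# minimising −log p(φ) drives p(r, g) to zero.
def semantic_loss(p_rg):
    return -torch.log(1.0 - p_rg)

# Independent model: one logit per variable, joint probability as a product.
theta_ind = torch.zeros(2, requires_grad=True)
# Expressive model: one logit per joint state (rg, r¬g, ¬rg, ¬r¬g).
theta_exp = torch.zeros(4, requires_grad=True)

optimiser = torch.optim.SGD([theta_ind, theta_exp], lr=0.1)
for _ in range(10_000):
    optimiser.zero_grad()
    p_r, p_g = torch.sigmoid(theta_ind)                   # Bernoulli marginals
    p_rg_independent = p_r * p_g                          # p(r, g) under independence
    p_rg_expressive = torch.softmax(theta_exp, dim=0)[0]  # p(r, g) parameterised directly
    loss = semantic_loss(p_rg_independent) + semantic_loss(p_rg_expressive)
    loss.backward()
    optimiser.step()
```

The contrast this makes visible is the one Figure 3 plots: the expressive softmax can spread its remaining mass freely over the three worlds satisfying the constraint, while the independent product is biased toward deterministic solutions.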
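The Experiment Setup row blends the semantic loss with an entropy term H(pθ|φ). The quoted computation of H lost symbols in PDF extraction, so the sketch below implements one plausible reading, which is an assumption rather than the paper's verbatim definition: the Shannon entropy of the model distribution conditioned on the constraint, combined as Lα = (1 − α)·L − α·H and swept over several values of α as in the Figure 8 experiment.

```python
import torch

def satisfying_probs(theta):
    # Independent model: probabilities of the three worlds satisfying ¬(r ∧ g).
    p_r, p_g = torch.sigmoid(theta)
    return torch.stack([
        (1 - p_r) * p_g,        # (¬r, g)
        p_r * (1 - p_g),        # (r, ¬g)
        (1 - p_r) * (1 - p_g),  # (¬r, ¬g)
    ])

def loss_alpha(theta, alpha):
    p_sat = satisfying_probs(theta)
    p_phi = p_sat.sum()                      # p(φ) = 1 − p(r, g)
    sl = -torch.log(p_phi)                   # semantic loss L(θ)
    p_cond = p_sat / p_phi                   # pθ(· | φ)
    h = -(p_cond * torch.log(p_cond)).sum()  # H(pθ|φ), assumed Shannon-entropy reading
    return (1 - alpha) * sl - alpha * h

for alpha in (0.0, 0.25, 0.5, 0.75):
    theta = torch.zeros(2, requires_grad=True)
    optimiser = torch.optim.SGD([theta], lr=0.1)
    for _ in range(10_000):
        optimiser.zero_grad()
        loss_alpha(theta, alpha).backward()
        optimiser.step()
    with torch.no_grad():
        print(f"alpha={alpha}: p(r,g)={float(torch.sigmoid(theta).prod()):.4f}")
```

An independent product of Bernoullis cannot place a uniform conditional over the three satisfying worlds, so the entropy term trades off against the semantic loss rather than removing the bias, which is the behaviour the paper's Figure 8 examines across values of α.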