On the Independence Assumption in Neurosymbolic Learning
Authors: Emile Van Krieken, Pasquale Minervini, Edoardo Ponti, Antonio Vergari
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To visualise what the possible distributions found by minimising the semantic loss look like, we compare independent distributions and expressive distributions on the traffic light problem in Figure 3. We modelled independent distributions with two real-valued parameters and a sigmoid, and expressive distributions with 4 real-valued parameters and a softmax. Then, we minimise the semantic loss to ensure p(r, g) = 0. We use gradient descent for 10,000 iterations with a learning rate of 0.1. |
| Researcher Affiliation | Academia | 1School of Informatics, University of Edinburgh. |
| Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any explicit statement about releasing open-source code or a link to a code repository. |
| Open Datasets | Yes | Example 2.1 (Learning with algorithms). MNIST Addition is a popular benchmark task in neurosymbolic learning (Manhaeve et al., 2021). |
| Dataset Splits | No | The paper mentions using a 'labelled dataset' and an 'unlabelled dataset' but does not specify explicit train/validation/test splits, percentages, or sample counts. |
| Hardware Specification | No | The paper does not specify the hardware used for running the experiments or visualizations (e.g., CPU, GPU models, memory). |
| Software Dependencies | No | The paper mentions using 'gradient descent' and refers to neural networks, but it does not specify any software names with version numbers (e.g., PyTorch 1.x, TensorFlow 2.x, Python 3.x). |
| Experiment Setup | Yes | We use gradient descent for 10,000 iterations with a learning rate of 0.1. ... we use the loss function Lα(θ) = (1 − α)L(θ) − αH(pθ|φ), where we compute H(pθ|φ) = 1/3(pθ(¬r, g) + pθ(r, ¬g) + pθ(¬r, ¬g)). We plot the results in Figure 8 for the independent model and various values of α. |
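The experiment quoted above (comparing an independent model with two parameters and a sigmoid against an expressive model with four parameters and a softmax, minimising the semantic loss for 10,000 gradient-descent iterations at learning rate 0.1) can be sketched as follows. This is a minimal reconstruction, not the authors' code: the variable names, the use of PyTorch SGD, and the exact form of the semantic loss (negative log-probability that the constraint ¬(r ∧ g) holds) are our assumptions; the parameter counts, iteration budget, and learning rate come from the quoted setup.

```python
import torch

torch.manual_seed(0)

# Independent model: one logit per variable (r, g); p(r, g) = p(r) p(g).
theta_ind = torch.zeros(2, requires_grad=True)
# Expressive model: one logit per joint state, softmax over {rg, r¬g, ¬rg, ¬r¬g}.
theta_exp = torch.zeros(4, requires_grad=True)

def p_rg_independent(theta):
    pr, pg = torch.sigmoid(theta)
    return pr * pg  # independence assumption: joint factorises

def p_rg_expressive(theta):
    # First softmax entry is taken to be the probability of the state (r, g).
    return torch.softmax(theta, dim=0)[0]

def semantic_loss(p_rg):
    # Semantic loss for the constraint p(r, g) = 0, i.e. ¬(r ∧ g):
    # negative log-probability that the constraint is satisfied.
    return -torch.log(1.0 - p_rg + 1e-12)

for theta, p_fn in [(theta_ind, p_rg_independent),
                    (theta_exp, p_rg_expressive)]:
    opt = torch.optim.SGD([theta], lr=0.1)
    for _ in range(10_000):
        opt.zero_grad()
        loss = semantic_loss(p_fn(theta))
        loss.backward()
        opt.step()

# Both models drive p(r, g) toward 0, but the independent model can only
# do so by shrinking p(r) and/or p(g), whereas the expressive model can
# keep all three satisfying states likely while zeroing out (r, g).
print(p_rg_independent(theta_ind).item())
print(p_rg_expressive(theta_exp).item())
```

The point of the comparison in the paper's Figure 3 is visible in the two resulting distributions: the independent parameterisation couples p(r, g) to the marginals, so minimising the semantic loss distorts the whole distribution, while the softmax parameterisation can satisfy the constraint directly.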