Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Learning to Reason: Leveraging Neural Networks for Approximate DNF Counting
Authors: Ralph Abboud, Ismail Ceylan, Thomas Lukasiewicz | Pages: 3097-3104
AAAI 2020 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct experiments to validate our method, and show that our model learns and generalizes very well to large-scale #DNF instances. |
| Researcher Affiliation | Academia | Ralph Abboud, Ismail Ilkan Ceylan, Thomas Lukasiewicz Department of Computer Science University of Oxford, UK EMAIL |
| Pseudocode | No | The paper describes a "Message Passing Protocol" with steps and provides a visual representation in Figure 1, but it does not contain structured pseudocode or an algorithm block. |
| Open Source Code | No | The paper provides an arXiv link (arxiv.org/pdf/1904.02688.pdf) to an extended version of the paper, not a direct link to open-source code for the described methodology. There is no other statement regarding code availability. |
| Open Datasets | No | The paper states: "Owing to the lack of standardized benchmarks, we generate synthetic formulas using a novel randomized procedure designed to produce more variable formulas. We generate 100K distinct training formulas, where formula counts per n are shown in Table 1." While the generation procedure is described, the specific generated dataset files used are not made publicly available or linked. |
| Dataset Splits | No | The paper describes the generation of "100K distinct training formulas" and various "test set" formulas, but it does not explicitly mention or provide details for a separate validation dataset split. |
| Hardware Specification | Yes | We train the system for 4 epochs on a P100 GPU using KL divergence loss... KLM and the GNN ran over different hardware (Haswell E5-2640v3 CPU vs. P100 GPU, resp.). |
| Software Dependencies | No | The paper mentions using the "Adam optimizer" but does not provide specific version numbers for Adam or any other software libraries or frameworks used (e.g., PyTorch, TensorFlow, Python version). |
| Experiment Setup | Yes | For all experiments, we use k = 128-dimensional vector representations. We define f_enc as a 3-layer MLP with layer sizes 8, 32, and 128, message-generating MLPs (M_l, M_c, and M_d) as 4-layer MLPs with 128-sized layers, and f_out as a 3-layer MLP with layers of size 32, 8, and 2... We train the system for 4 epochs on a P100 GPU using KL divergence loss, the Adam optimizer (Kingma and Ba 2015), a learning rate of λ = 10^-5, a gradient clipping ratio of 0.5, and T = 8 message passing iterations. |
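
The experiment setup quoted in the last row can be collected into a single configuration object. The following is a minimal sketch, not the authors' code: the class name `ExperimentConfig`, its field names, and the helper `layer_shapes` are hypothetical. The learning rate is read as 10^-5 (the negative sign of the exponent is an assumption), and feeding a 128-dimensional representation into f_out is likewise an assumption made only to illustrate the helper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ExperimentConfig:
    """Hyperparameters quoted in the Experiment Setup row (field names are hypothetical)."""
    embedding_dim: int = 128                                      # k = 128
    f_enc_layers: Tuple[int, ...] = (8, 32, 128)                  # 3-layer encoder MLP
    message_mlp_layers: Tuple[int, ...] = (128, 128, 128, 128)    # M_l, M_c, M_d
    f_out_layers: Tuple[int, ...] = (32, 8, 2)                    # 3-layer output MLP
    epochs: int = 4
    loss: str = "kl_divergence"
    optimizer: str = "adam"
    learning_rate: float = 1e-5    # assumption: exponent read as -5
    grad_clip: float = 0.5
    message_passing_iterations: int = 8                           # T = 8


def layer_shapes(input_dim: int, widths: Tuple[int, ...]) -> List[Tuple[int, int]]:
    """Return (in_features, out_features) pairs for an MLP with the given layer widths."""
    dims = (input_dim,) + tuple(widths)
    return list(zip(dims[:-1], dims[1:]))


config = ExperimentConfig()
# Assumption: f_out consumes a k-dimensional representation.
print(layer_shapes(config.embedding_dim, config.f_out_layers))
```

The input dimension of f_enc is not stated in the quoted text, so it is left as a caller-supplied argument rather than filled in.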