Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

A learnability analysis on neuro-symbolic learning

Authors: Hao-Yuan He, Ming Li

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	To empirically validate the theoretical results, we conducted a series of experiments, including arithmetic tasks shown in table 1 and BDD-OIA (Xu et al., 2020), which is evaluated in Marconato et al. (2023b) as a realistic application. ... We empirically evaluate the learnability of NeSy tasks based on theorem 3.6, focusing on two key aspects: (i) validating that minimizing the NeSy risk consistently minimizes the concept risk for learnable tasks, and (ii) examining how DCSP solution disagreement affects learnability.
Researcher Affiliation	Academia	Hao-Yuan He, Ming Li; National Key Laboratory for Novel Software Technology, Nanjing University; School of Artificial Intelligence, Nanjing University; EMAIL
Pseudocode	Yes	Algorithm 1 DCSP Solution
Open Source Code	Yes	Here, we base our approach on ABLKit (Huang et al., 2024) [1] and the code of He et al. (2024b) [2]. ... [1] https://github.com/AbductiveLearning/ABLkit [2] https://github.com/Hao-Yuan-He/A3BL
Open Datasets	Yes	Setup. Manhaeve et al. (2018) proposed the digit addition task by incorporating the handwritten MNIST (LeCun et al., 1994) and predefined addition rules. We extend the setup by including KMNIST (Clanuwat et al., 2018), CIFAR10 (Krizhevsky, 2009), and SVHN (Netzer et al., 2011), mapping class indices to digits, e.g., CIFAR-10 classes (airplane = 0, ...), and enriching the background knowledge as depicted in table 1. The learning model for MNIST and KMNIST is LeNet (LeCun & Bengio, 1998), while ResNet50 (He et al., 2016) is used for CIFAR10 and SVHN. Besides that, we also adopt BDD-OIA from Bortolotti et al. (2024), which is a multi-label autonomous driving task for studying RSs in real-world, high-stakes scenarios.
Dataset Splits	No	The paper mentions controlling sample size by resampling data: "During the dataset construction process, we control the sample size by resampling data until the sequence size exceeds a threshold, denoted as sample size. For figure 2, the sample size is set to 30,000, while for aggregation experiments it is set to 120,000; other values are specified in the respective plots." However, it does not provide specific train/test/validation splits for the datasets used.
Hardware Specification	Yes	The results were obtained using an Intel Xeon Platinum 8538 CPU and an NVIDIA A100-PCIE-40GB GPU on an Ubuntu 20.04 platform.
Software Dependencies	No	The paper mentions software like ABLKit, Choco, PyTorch, and Ubuntu, but does not provide specific version numbers for these software components or libraries, which are required for a reproducible description of ancillary software.
Experiment Setup	Yes	Optimizer Configurations. All experiments use AdamW, a weight-decay variant of Adam (Kingma & Ba, 2015), as the optimizer, with a learning rate of 0.0015 and betas set to (0.9, 0.99). The batch size is set to 256, and unless otherwise noted, the number of epochs is set to 10. The loss function used for optimization is cross-entropy, with further details available in the support code.
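
Illustrative sketch (not from the paper or its support code): the hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration object. The class name TrainConfig and the method adamw_kwargs are hypothetical names chosen for this example. The paper does not report a weight-decay value, so the sketch leaves it unset rather than guessing one.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class TrainConfig:
    """Hyperparameters as quoted in the Experiment Setup row (hypothetical container)."""

    learning_rate: float = 0.0015
    betas: Tuple[float, float] = (0.9, 0.99)
    batch_size: int = 256
    epochs: int = 10  # the paper says "unless otherwise noted"
    loss: str = "cross_entropy"

    def adamw_kwargs(self) -> Dict[str, object]:
        # Weight decay is not reported in the quoted text, so it is omitted
        # here and would fall back to whatever default the optimizer uses.
        return {"lr": self.learning_rate, "betas": self.betas}


config = TrainConfig()
print(config.adamw_kwargs())
```

Assuming PyTorch is installed, these keyword arguments could be passed as torch.optim.AdamW(model.parameters(), **config.adamw_kwargs()), paired with torch.nn.CrossEntropyLoss() as the loss. The exact versions the authors used are not stated, so behavior may differ from their runs.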