Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Foundations of Symbolic Languages for Model Interpretability
Authors: Marcelo Arenas, Daniel Báez, Pablo Barceló, Jorge Pérez, Bernardo Subercaseaux
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We also present a prototype implementation of FOIL wrapped in a high-level declarative language, and perform experiments showing that such a language can be used in practice. |
| Researcher Affiliation | Academia | Marcelo Arenas (1,4), Daniel Báez (3), Pablo Barceló (2,4), Jorge Pérez (3,4), Bernardo Subercaseaux (4,5). (1) Department of Computer Science, PUC-Chile; (2) Institute for Mathematical and Computational Engineering, PUC-Chile; (3) Department of Computer Science, Universidad de Chile; (4) Millennium Institute for Foundational Research on Data, Chile; (5) Carnegie Mellon University, USA |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | A detailed exposition along with our implementation and a set of real examples can be found in the supplementary material. |
| Open Datasets | Yes | We tested a set of 20 handcrafted queries over decision trees with up to 400 leaves trained for the Student Performance Data Set [29], which combines Boolean and numerical features. |
| Dataset Splits | No | The paper mentions training models on 'random input data' and the 'Student Performance Data Set [29]', but does not specify any training, validation, or test dataset splits (e.g., percentages or counts). |
| Hardware Specification | Yes | All experiments were run on a personal computer with a 2.48GHz Intel N3060 processor and 2GB RAM. The exact details of the machine are presented in the supplementary material. |
| Software Dependencies | No | The paper mentions 'Scikit-learn [30] library' but does not provide specific version numbers for it or any other key software components used in the experiments. |
| Experiment Setup | Yes | We tested the efficiency of our implementation varying three different parameters: the number of input features, the number of leaves of the decision tree, and the size of the input queries. We created a set of random queries with 1 to 4 quantified variables, and a varying number of operators (60 different queries). |
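
The experiment setup above varies the number of input features and the number of leaves of the decision tree. The following is a minimal sketch of how such a sweep could be organized, using only the Python standard library. It is not the authors' implementation: the paper reports using Scikit-learn, whereas here `random_tree`, `count_leaves`, `predict`, and the grid values (other than the 400-leaf upper bound) are hypothetical names and numbers chosen for illustration, and FOIL queries are not modeled at all.

```python
import random
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    feature: Optional[int] = None   # Boolean feature tested here; None at a leaf
    left: Optional["Node"] = None   # branch taken when the feature is 0
    right: Optional["Node"] = None  # branch taken when the feature is 1
    label: Optional[int] = None     # class label, set only at leaves


def random_tree(n_features: int, n_leaves: int, rng: random.Random) -> Node:
    """Grow a random tree by repeatedly splitting a randomly chosen leaf.

    Assumes n_leaves >= 1. A feature may repeat along a path, so some
    leaves can be unreachable; that is acceptable for a size-scaling sketch.
    """
    root = Node(label=rng.randint(0, 1))
    leaves = [root]
    while len(leaves) < n_leaves:
        leaf = leaves.pop(rng.randrange(len(leaves)))
        leaf.feature, leaf.label = rng.randrange(n_features), None
        leaf.left = Node(label=rng.randint(0, 1))
        leaf.right = Node(label=rng.randint(0, 1))
        leaves.extend([leaf.left, leaf.right])
    return root


def count_leaves(root: Node) -> int:
    """Count leaves iteratively to avoid deep recursion on skewed trees."""
    total, stack = 0, [root]
    while stack:
        node = stack.pop()
        if node.feature is None:
            total += 1
        else:
            stack.extend([node.left, node.right])
    return total


def predict(root: Node, x: List[int]) -> int:
    """Follow the path selected by the Boolean instance x down to a leaf."""
    node = root
    while node.feature is not None:
        node = node.right if x[node.feature] else node.left
    return node.label


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical grid; only the 400-leaf bound comes from the text above.
    for n_features in (10, 50):
        for n_leaves in (50, 400):
            tree = random_tree(n_features, n_leaves, rng)
            xs = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(1000)]
            start = time.perf_counter()
            for x in xs:
                predict(tree, x)
            elapsed = time.perf_counter() - start
            print(f"features={n_features} leaves={n_leaves} time={elapsed:.4f}s")
```

Splitting one leaf at a time adds exactly one leaf per step, which makes the leaf count an exact, controllable parameter of the sweep.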