Foundations of Symbolic Languages for Model Interpretability

Authors: Marcelo Arenas, Daniel Báez, Pablo Barceló, Jorge Pérez, Bernardo Subercaseaux

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "We also present a prototype implementation of FOIL wrapped in a high-level declarative language, and perform experiments showing that such a language can be used in practice." |
| Researcher Affiliation | Academia | Marcelo Arenas (1,4), Daniel Báez (3), Pablo Barceló (2,4), Jorge Pérez (3,4), Bernardo Subercaseaux (4,5). 1: Department of Computer Science, PUC-Chile; 2: Institute for Mathematical and Computational Engineering, PUC-Chile; 3: Department of Computer Science, Universidad de Chile; 4: Millennium Institute for Foundational Research on Data, Chile; 5: Carnegie Mellon University, USA |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | "A detailed exposition along with our implementation and a set of real examples can be found in the supplementary material." |
| Open Datasets | Yes | "We tested a set of 20 handcrafted queries over decision trees with up to 400 leaves trained for the Student Performance Data Set [29], which combines Boolean and numerical features." |
| Dataset Splits | No | The paper mentions training models on "random input data" and on the Student Performance Data Set [29], but does not specify any training, validation, or test splits (e.g., percentages or counts). |
| Hardware Specification | Yes | "All experiments were run on a personal computer with a 2.48GHz Intel N3060 processor and 2GB RAM. The exact details of the machine are presented in the supplementary material." |
| Software Dependencies | No | The paper mentions the Scikit-learn [30] library but does not provide version numbers for it or for any other key software components used in the experiments. |
| Experiment Setup | Yes | "We tested the efficiency of our implementation varying three different parameters: the number of input features, the number of leaves of the decision tree, and the size of the input queries. We created a set of random queries with 1 to 4 quantified variables, and a varying number of operators (60 different queries)." |
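To make the conditions in the Experiment Setup row concrete, the sketch below reproduces only the surroundings of the benchmark: scikit-learn decision trees with a bounded number of leaves, trained on random input data that mixes Boolean and numerical features, with tree sizes varied up to the 400 leaves mentioned above. It is not the authors' implementation; the timed "query" is a placeholder batch prediction, since evaluating actual FOIL queries requires the engine distributed in the paper's supplementary material. All sample counts, feature counts, and timing parameters are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of the benchmark setup:
# decision trees of bounded size trained on random Boolean + numerical data.
# The timed "query" is a placeholder prediction pass; the paper instead
# evaluates FOIL queries over the structure of the tree.
import time

import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

def train_tree(n_numeric: int, n_boolean: int, max_leaves: int,
               n_samples: int = 2000) -> DecisionTreeClassifier:
    """Train a tree on random inputs, capping its number of leaves."""
    X = np.hstack([
        rng.random((n_samples, n_numeric)),               # numerical features
        rng.integers(0, 2, size=(n_samples, n_boolean)),  # Boolean features
    ])
    y = rng.integers(0, 2, size=n_samples)                # random binary labels
    return DecisionTreeClassifier(max_leaf_nodes=max_leaves,
                                  random_state=0).fit(X, y)

def time_placeholder_query(clf: DecisionTreeClassifier,
                           X: np.ndarray, repeats: int = 20) -> float:
    """Average wall-clock time of a stand-in query (batch classification)."""
    start = time.perf_counter()
    for _ in range(repeats):
        clf.predict(X)
    return (time.perf_counter() - start) / repeats

# Vary tree size up to the 400 leaves reported in the paper.
for leaves in (50, 100, 200, 400):
    clf = train_tree(n_numeric=20, n_boolean=10, max_leaves=leaves)
    X_query = np.hstack([rng.random((1000, 20)),
                         rng.integers(0, 2, size=(1000, 10))])
    print(f"{leaves:4d} leaves: {time_placeholder_query(clf, X_query):.5f} s")
```

The remaining dimension of the reported setup, query size (1 to 4 quantified variables and a varying number of operators, 60 queries in total), has no stand-in here because it depends on the FOIL query language itself.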