Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Efficient XAI: A Low-Cost Data Reduction Approach to SHAP Interpretability
Authors: Severin Bachmann
JAIR 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Through controlled experiments on synthetic datasets, we analyze the stability of SHAP values under Slovin-based subsampling across varying data characteristics, including feature and target types and distributions, and dataset sizes. Our findings reveal a U-shaped trade-off: SHAP values for midranked features remain stable, whereas extreme values exhibit higher fluctuations. |
| Researcher Affiliation | Academia | Corresponding Author. Author's Contact Information: Severin Bachmann, orcid: 0000-0001-5996-4152, EMAIL, Nuremberg Research Institute for Cooperative Studies, Nuremberg, Bavaria, Germany. |
| Pseudocode | No | The paper does not contain any explicit pseudocode or algorithm blocks. Figure 1 provides a 'Methodology overview' as a diagram, but not in a pseudocode format. |
| Open Source Code | No | The paper discusses the methodology and application of Slovin's formula but does not provide any explicit statements or links to open-source code for the described work. |
| Open Datasets | No | Through controlled experiments on synthetic datasets, we analyze the stability of SHAP values under Slovin-based subsampling across varying data characteristics, including feature and target types and distributions, and dataset sizes. Our findings reveal a U-shaped trade-off: SHAP values for midranked features remain stable, whereas extreme values exhibit higher fluctuations. The data section then outlines the generation of synthetic datasets in detail. |
| Dataset Splits | Yes | The numbers represent the test dataset sizes, corresponding to a 20% fraction of the total dataset sizes illustrated in Figure 2. |
| Hardware Specification | No | The paper mentions the general concept of GPUs' computational power in the literature review, but it does not specify any particular hardware (e.g., GPU models, CPU models, or memory) used for running the experiments described in the paper. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., specific Python libraries like PyTorch or TensorFlow with their versions) that were used to conduct the experiments. |
| Experiment Setup | No | Once the synthetic data is prepared, it is used to train several machine learning models. The selected models include linear regression (LR), extreme gradient boosting (XGB), neural networks (NN), and support vector machines (SVM). Each of these models offers unique advantages, ranging from the simplicity and interpretability of LR to the sophisticated capabilities of NN and XGB for capturing non-linear relationships and interactions. This range covers the common models applied in ML contexts. The paper lists the types of models used (LR, XGB, NN, SVM) but does not provide specific hyperparameter values or training configurations for these models. |
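The table above repeatedly refers to Slovin-based subsampling as the paper's data-reduction step. For readers unfamiliar with it, the following is a minimal sketch of Slovin's formula, n = N / (1 + N·e²); the function name and the choice of margin of error e = 0.05 are illustrative assumptions, not taken from the paper.

```python
def slovin_sample_size(population_size: int, margin_of_error: float = 0.05) -> int:
    """Slovin's formula: n = N / (1 + N * e^2).

    Returns the subsample size n for a dataset of N rows at a given
    margin of error e. Note: the paper under review uses this formula
    to subsample data before computing SHAP values; the parameters
    here are illustrative.
    """
    n = population_size / (1 + population_size * margin_of_error ** 2)
    return int(round(n))

# A dataset of 100,000 rows at e = 0.05 reduces to roughly 400 samples,
# illustrating why SHAP computation over the subsample is far cheaper.
print(slovin_sample_size(100_000))  # -> 398
```

Because n grows sublinearly in N (it saturates near 1/e² for large N), the cost of SHAP estimation on the subsample stays nearly constant as the full dataset grows, which is the efficiency argument the paper's title alludes to.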