Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions

Authors: Fabian Fumagalli, Maximilian Muschalik, Patrick Kolpaczki, Eyke Hüllermeier, Barbara Hammer

ICML 2024 | Venue PDF | LLM Run Details
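As background for the weighted least squares framing named in the title: classic KernelSHAP (Lundberg & Lee, 2017) recovers order-1 Shapley values as the solution of a weighted least squares problem over coalitions, which this paper generalizes to Shapley interactions. A minimal pure-Python sketch of that order-1 idea, with full coalition enumeration and the efficiency constraints on the empty and full coalitions approximated by large weights; this illustrates the WLS perspective only and is not the paper's KernelSHAP-IQ algorithm:

```python
import itertools
import math

def _solve(A, b):
    # Gaussian elimination with partial pivoting for a small dense system.
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def kernel_shap(value, n):
    # Shapley values as the argmin of a weighted least squares problem:
    # fit a linear model beta_0 + sum_i beta_i * 1[i in S] to the game
    # values v(S), weighting each coalition S by the Shapley kernel.
    players = list(range(n))
    rows, w, y = [], [], []
    for size in range(n + 1):
        for S in itertools.combinations(players, size):
            rows.append([1.0] + [1.0 if i in S else 0.0 for i in players])
            if size in (0, n):
                w.append(1e7)  # approximates the exact efficiency constraints
            else:
                w.append((n - 1) / (math.comb(n, size) * size * (n - size)))
            y.append(value(set(S)))
    d = n + 1
    # Normal equations (X^T W X) beta = X^T W y of the weighted problem.
    A = [[sum(w[k] * rows[k][i] * rows[k][j] for k in range(len(rows)))
          for j in range(d)] for i in range(d)]
    b = [sum(w[k] * rows[k][i] * y[k] for k in range(len(rows))) for i in range(d)]
    beta = _solve(A, b)
    return beta[0], beta[1:]  # baseline value and Shapley values phi_1..phi_n
```

For the symmetric three-player game v(S) = |S|^2 this yields Shapley values close to (3, 3, 3). KernelSHAP-IQ extends the regression to interaction terms over subsets up to a chosen order.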

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct multiple experiments to compare KernelSHAP-IQ with existing baseline methods for estimating SII and k-SII values. For each method, we assess estimation quality with mean-squared error (MSE; lower is better) and precision at ten (Prec@10; higher is better) compared to ground-truth (GT) SII and k-SII values.
Researcher Affiliation | Academia | Bielefeld University, CITEC, D-33619 Bielefeld, Germany; LMU Munich, D-80539 Munich, Germany; MCML, Munich; Paderborn University, D-33098 Paderborn, Germany.
Pseudocode | Yes | Algorithm 1: KernelSHAP-IQ.
Open Source Code | Yes | KernelSHAP-IQ is implemented in the open-source shapiq explanation library: github.com/mmschlk/shapiq.
Open Datasets | Yes | IMDB dataset (Maas et al., 2011), California Housing (CH) (Kelley Pace & Barry, 1997), ImageNet (Deng et al., 2009), bike rental (BR) (Fanaee-T & Gama, 2014), adult census (AC) (Kohavi, 1996).
Dataset Splits | No | The paper uses standard benchmark datasets but does not explicitly provide the training, validation, and test splits (e.g., percentages, sample counts, or citations to standard splits) needed for reproducibility.
Hardware Specification | Yes | All benchmarks are performed on a single Dell XPS 15 9510 laptop with an Intel i7-11800H clocked at 2.30 GHz.
Software Dependencies | No | The paper mentions software such as PyTorch, scikit-learn, and the transformers API, but does not provide specific version numbers for these dependencies.
Experiment Setup | No | The paper describes the models trained (e.g., an XGBoost regressor, a small neural network, a random forest classifier) but does not provide hyperparameters or system-level training settings such as learning rate, batch size, or optimizer configuration.
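The MSE and Prec@10 metrics named in the Research Type row are standard; a minimal pure-Python sketch of both, assuming the common top-k overlap definition of precision at k (function names are illustrative, not taken from the paper's codebase):

```python
def mse(gt, est):
    # Mean-squared error between ground-truth and estimated interaction values.
    return sum((g - e) ** 2 for g, e in zip(gt, est)) / len(gt)

def precision_at_k(gt, est, k=10):
    # Fraction of the k largest-magnitude ground-truth interactions that the
    # estimator also ranks among its own top k (1.0 = perfect recovery).
    top_gt = set(sorted(range(len(gt)), key=lambda i: -abs(gt[i]))[:k])
    top_est = set(sorted(range(len(est)), key=lambda i: -abs(est[i]))[:k])
    return len(top_gt & top_est) / k
```

Here `gt` and `est` are flat lists of interaction values indexed by the same enumeration of feature subsets; an estimator scores well when its largest interactions coincide with the ground truth's.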