Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. [K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026].

OrdShap: Feature Position Importance for Sequential Black-Box Models

Authors: Davin Hill, Brian Hill, Aria Masoomi, Vijay Nori, Robert E. Tillman, Jennifer Dy

NeurIPS 2025

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We empirically evaluate OrdShap on its ability to identify the importance of a given sample's feature ordering. In 6.1 we quantitatively evaluate OrdShap against existing attribution methods. In 6.2 we investigate OrdShap on a synthetic dataset. In 6.3 we qualitatively compare OrdShap with Kernel SHAP. In 6.4 we present execution time results. |
| Researcher Affiliation | Collaboration | Davin Hill (Northeastern University); Brian L. Hill (Age Bold); Aria Masoomi (Northeastern University); Vijay S. Nori (Optum AI); Robert E. Tillman (Optum AI); Jennifer Dy (Northeastern University) |
| Pseudocode | Yes | Algorithm 1: OrdShap Sampling Algorithm |
| Open Source Code | No | All source code will be provided for the review process. Public release of the source code is contingent on an internal review process and will be released if/when possible after acceptance. |
| Open Datasets | Yes | We evaluated OrdShap on two EHR datasets (MIMIC-III [34] and eICU [57]) and a natural language dataset (IMDB [45]). |
| Dataset Splits | Yes | We split the patients into a training set (80%) and test set (20%). |
| Hardware Specification | Yes | All experiments were performed on an internal cluster using AMD 7302 16-Core processors and NVIDIA A100 GPUs. |
| Software Dependencies | No | We train a modified BERT model [19] using the Huggingface library [85] with 6 attention heads, 3 hidden layers of width 384, and a dropout rate of 0.5. We use the implementation from the Captum library [37] in our experiments. We use the official implementation of Kernel SHAP from the SHAP library. |
| Experiment Setup | Yes | After preprocessing, we train a modified BERT model [19] using the Huggingface library [85] with 6 attention heads, 3 hidden layers of width 384, and a dropout rate of 0.5. |
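
The Dataset Splits entry reports an 80%/20% split at the patient level. The sketch below shows one way such a split might be done, so that all records for a given patient land on the same side. It is illustrative only: the function name `split_patients`, the `seed` parameter, and the shuffling procedure are assumptions, not details taken from the paper.

```python
import random


def split_patients(patient_ids, train_frac=0.8, seed=0):
    """Split unique patient IDs into disjoint train/test lists.

    Hypothetical helper: the paper's actual seed and procedure
    are not stated in this summary.
    """
    # Deduplicate so a patient with many records is counted once.
    unique_ids = sorted(set(patient_ids))
    # A seeded generator makes the shuffle repeatable across runs.
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_train = int(round(train_frac * len(unique_ids)))
    return unique_ids[:n_train], unique_ids[n_train:]


# Example with 1000 synthetic patient IDs.
train_ids, test_ids = split_patients(range(1000))
```

Splitting on patient IDs rather than on individual records avoids leaking a patient's data between the training and test sets.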