Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.
OrdShap: Feature Position Importance for Sequential Black-Box Models
Authors: Davin Hill, Brian Hill, Aria Masoomi, Vijay Nori, Robert E. Tillman, Jennifer Dy
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We empirically evaluate OrdShap on its ability to identify the importance of a given sample's feature ordering. In 6.1 we quantitatively evaluate OrdShap against existing attribution methods. In 6.2 we investigate OrdShap on a synthetic dataset. In 6.3 we qualitatively compare OrdShap with Kernel SHAP. In 6.4 we present execution time results. |
| Researcher Affiliation | Collaboration | Davin Hill (Northeastern University); Brian L. Hill (Age Bold); Aria Masoomi (Northeastern University); Vijay S. Nori (Optum AI); Robert E. Tillman (Optum AI); Jennifer Dy (Northeastern University) |
| Pseudocode | Yes | Algorithm 1: OrdShap Sampling Algorithm |
| Open Source Code | No | All source code will be provided for the review process. Public release of the source code is contingent on an internal review process and will be released if/when possible after acceptance. |
| Open Datasets | Yes | We evaluated OrdShap on two EHR datasets (MIMIC-III [34] and eICU [57]) and a natural language dataset (IMDB [45]). |
| Dataset Splits | Yes | We split the patients into a training set (80%) and test set (20%). |
| Hardware Specification | Yes | All experiments were performed on an internal cluster using AMD 7302 16-Core processors and NVIDIA A100 GPUs. |
| Software Dependencies | No | We train a modified BERT model [19] using the Huggingface library [85] with 6 attention heads, 3 hidden layers of width 384, and a dropout rate of 0.5. We use the implementation from the Captum library [37] in our experiments. We use the official implementation of Kernel SHAP from the SHAP library. |
| Experiment Setup | Yes | After preprocessing, we train a modified BERT model [19] using the Huggingface library [85] with 6 attention heads, 3 hidden layers of width 384, and a dropout rate of 0.5. |
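
The 80%/20% patient split quoted under Dataset Splits can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_patients`, the `seed` parameter, and the use of integer patient IDs are assumptions made here, and the source does not say how the split was randomized.

```python
import random


def split_patients(patient_ids, train_frac=0.8, seed=0):
    """Split patient IDs into train/test sets at the patient level.

    Hypothetical helper: the name, the seed, and the shuffling strategy
    are assumptions; the source only states an 80%/20% split.
    """
    # Deduplicate so that each patient lands in exactly one partition.
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_train = int(round(train_frac * len(unique_ids)))
    return set(unique_ids[:n_train]), set(unique_ids[n_train:])


# Example with 100 synthetic patient IDs (not real data).
train_ids, test_ids = split_patients(range(100))
```

Splitting on patient IDs rather than on individual records keeps all of a patient's data in one partition, which avoids leakage between the training and test sets.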