Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
Towards Explaining Distribution Shifts
Authors: Sean Kulinski, David I. Inouye
ICML 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In section 5, we show empirical results on real-world tabular, text, and image-based datasets demonstrating how our explanations can aid an operator in understanding how a distribution has shifted. |
| Researcher Affiliation | Academia | 1Department of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, USA. |
| Pseudocode | Yes | Algorithm 1 Finding k-Sparse Maps; Algorithm 2 Solving for k-Cluster Mappings |
| Open Source Code | Yes | Code to recreate the experiments can be found at https://github.com/inouye-lab/explaining-distribution-shifts. |
| Open Datasets | Yes | US Census Adult Income dataset (Kohavi & Becker, 1996); Civil Comments dataset (Borkan et al., 2019); WILDS Camelyon17 dataset (Bandi et al., 2018); UCI Breast Cancer Wisconsin (Original) dataset (Mangasarian & Wolberg, 1990); MNIST digits (Deng, 2012); CelebA dataset (Liu et al., 2015) |
| Dataset Splits | No | The paper describes how datasets were used for training and analysis (e.g., 'We trained DIVA on the Shifted Multi-MNIST dataset for 600 epochs' and 'The SSVAE was trained for 200 epochs...with 80% of the labels available'), but it does not specify explicit validation splits (e.g., percentages or counts for a validation set) for model tuning or hyperparameter selection separate from the training data. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running experiments, such as particular GPU or CPU models, memory specifications, or cloud computing instance types. |
| Software Dependencies | No | The paper mentions implementing methods and training models but does not list specific software dependencies with version numbers (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | We trained DIVA on the Shifted Multi-MNIST dataset for 600 epochs with a KL-β value of 10 and latent dimension of 64 for each of the three sub-spaces; The SSVAE was trained for 200 epochs on a concatenation of both P_src and P_tgt with 80% of the labels available per environment, and a batch size of 128. |
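
The hyperparameters quoted in the Experiment Setup row could be collected in one place along the following lines. This is only a minimal sketch: the class and field names (`DivaConfig`, `SsvaeConfig`, `kl_beta`, `labeled_fraction`, and so on) are hypothetical and are not taken from the authors' repository, and the total latent size is simply derived here as 3 × 64. Settings the paper does not report (optimizer, learning rate, random seed) are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DivaConfig:
    """Hypothetical container for the DIVA settings quoted in the table."""
    dataset: str = "Shifted Multi-MNIST"
    epochs: int = 600
    kl_beta: float = 10.0              # the "KL-β value of 10"
    latent_dim_per_subspace: int = 64  # 64 per sub-space
    num_subspaces: int = 3             # three sub-spaces

    @property
    def total_latent_dim(self) -> int:
        # Derived value (not stated in the table): 3 sub-spaces x 64 dims.
        return self.latent_dim_per_subspace * self.num_subspaces


@dataclass(frozen=True)
class SsvaeConfig:
    """Hypothetical container for the SSVAE settings quoted in the table."""
    epochs: int = 200
    labeled_fraction: float = 0.8      # 80% of labels available per environment
    batch_size: int = 128


if __name__ == "__main__":
    diva = DivaConfig()
    ssvae = SsvaeConfig()
    print(diva, "total latent dim:", diva.total_latent_dim)
    print(ssvae)
```

Frozen dataclasses are used so that a recorded configuration cannot be modified accidentally after it has been logged alongside a run.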