Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Sum-of-Parts: Self-Attributing Neural Networks with End-to-End Learning of Feature Groups

Authors: Weiqiu You, Helen Qu, Marco Gatti, Bhuvnesh Jain, Eric Wong

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct experiments using our framework on image and text tasks to see if our theory-informed framework actually uses the learned groups for (RQ1) high performance.
Researcher Affiliation | Academia | 1 Department of Computer and Information Science, University of Pennsylvania, Philadelphia, PA, USA; 2 Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, PA, USA. Correspondence to: Weiqiu You <EMAIL>, Eric Wong <EMAIL>.
Pseudocode | Yes | Algorithm 1: The Sum-of-Parts Framework
Open Source Code | Yes | Our code is available at https://github.com/BrachioLab/sop
Open Datasets | Yes | We evaluate on two vision datasets and one language dataset: ImageNet-S (Gao et al., 2022) image classification using a Vision Transformer (Dosovitskiy et al., 2021) backbone, CosmoGrid (Kacprzak et al., 2023) cosmology image regression using a CNN (Matilla et al., 2020), and MultiRC (Khashabi et al., 2018) text classification using BERT (Devlin et al., 2019).
Dataset Splits | Yes | We evaluate on two vision datasets and one language dataset: ImageNet-S (Gao et al., 2022) image classification using a Vision Transformer (Dosovitskiy et al., 2021) backbone, CosmoGrid (Kacprzak et al., 2023) cosmology image regression using a CNN (Matilla et al., 2020), and MultiRC (Khashabi et al., 2018) text classification using BERT (Devlin et al., 2019). We use a CNN (Matilla et al., 2020) trained on the CosmoGrid training set.
Hardware Specification | Yes | We use NVIDIA A100 GPUs with 80 GB memory and NVIDIA A6000 GPUs with 48 GB memory on our internal cluster for experiments.
Software Dependencies | No | The paper mentions software like "Vision Transformer" and "BERT" but does not provide specific version numbers for these or any other key software libraries or frameworks used in the implementation.
Experiment Setup | Yes | For SOP model training, we use a learning rate of 5e-6 for all experiments. For ImageNet-S and MultiRC, we use one head in the attention of the group generator. For CosmoGrid, we use four heads. ... For ImageNet, we train for 1 epoch but take the checkpoint at 0.5 epochs, as it has already converged; we train for 3 epochs for CosmoGrid and 20 epochs for MultiRC.
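The paper's Algorithm 1 is not reproduced on this page. As a rough illustration of what a "sum of parts" prediction means, the sketch below composes a model output as a weighted sum of per-group scores, where each group is a mask over input features. All function names, the hard group masks, and the linear "backbone" are hypothetical simplifications for illustration, not the authors' implementation.

```python
# Minimal sketch of a sum-of-parts style prediction (assumed structure,
# not the authors' code): score each masked feature group separately,
# then combine the group scores with per-group weights.

def masked_score(x, mask, backbone_weights):
    """Score one feature group: linear score of the masked features."""
    return sum(xi * mi * wi for xi, mi, wi in zip(x, mask, backbone_weights))

def sum_of_parts_predict(x, group_masks, group_weights, backbone_weights):
    """Weighted sum of per-group scores -- the 'sum of parts'."""
    scores = [masked_score(x, m, backbone_weights) for m in group_masks]
    return sum(c * s for c, s in zip(group_weights, scores))

x = [1.0, 2.0, 3.0, 4.0]
group_masks = [[1, 1, 0, 0], [0, 0, 1, 1]]   # two hard groups, for illustration
group_weights = [0.5, 0.5]
backbone_weights = [0.1, 0.2, 0.3, 0.4]
print(sum_of_parts_predict(x, group_masks, group_weights, backbone_weights))  # 1.5
```

Because the final output is a sum over groups, each group's contribution to the prediction can be read off directly, which is the sense in which such a model is self-attributing.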
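The hyperparameters quoted in the Experiment Setup row can be collected into a small per-task configuration. The dictionary below is a hypothetical organization of those reported values for a reproduction script; the key names and structure are ours, and "checkpoint_epoch" marks the early checkpoint reported only for ImageNet-S.

```python
# Hypothetical per-task training configuration assembled from the values
# quoted above (learning rate, attention heads in the group generator,
# training epochs). Structure and key names are illustrative, not the
# authors' code.
SOP_TRAINING_CONFIG = {
    "imagenet-s": {"lr": 5e-6, "group_attn_heads": 1, "epochs": 1, "checkpoint_epoch": 0.5},
    "cosmogrid":  {"lr": 5e-6, "group_attn_heads": 4, "epochs": 3, "checkpoint_epoch": None},
    "multirc":    {"lr": 5e-6, "group_attn_heads": 1, "epochs": 20, "checkpoint_epoch": None},
}

# The same learning rate is reported for all experiments.
assert all(cfg["lr"] == 5e-6 for cfg in SOP_TRAINING_CONFIG.values())
```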