Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Leveraging Sparse Linear Layers for Debuggable Deep Networks
Authors: Eric Wong, Shibani Santurkar, Aleksander Madry
ICML 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate that it is possible to construct deep networks that have sparse decision layers (e.g., with only 20-30 deep features per class for ImageNet) without sacrificing much model performance. This involves developing a custom solver for fitting elastic net regularized linear models in order to perform effective sparsification at deep-learning scales. We show that sparsifying a network's decision layer can indeed help humans understand the resulting models better. For example, untrained annotators can intuit (simulate) the predictions of a model with a sparse decision layer with high (63%) accuracy. This is in contrast to their near chance performance (33%) for models with standard (dense) decision layers. We explore the use of sparse decision layers in three debugging tasks: diagnosing biases and spurious correlations (Section 4.1), counterfactual generation (Section 4.2) and identifying data patterns that cause misclassifications (Section 4.3). To enable this analysis, we design a suite of human-in-the-loop experiments. |
| Researcher Affiliation | Academia | 1 Massachusetts Institute of Technology, Cambridge, Massachusetts, USA. Correspondence to: Eric Wong <EMAIL>, Shibani Santurkar <EMAIL>. |
| Pseudocode | No | The paper describes algorithms and methods but does not provide formal pseudocode blocks or algorithms. |
| Open Source Code | Yes | 1 The code for our toolkit can be found at https://github.com/madrylab/debuggabledeepnetworks. 2 A standalone package of our solver is available at https://github.com/madrylab/glm_saga |
| Open Datasets | Yes | We perform our analysis on: (a) ResNet-50 classifiers (He et al., 2016) trained on ImageNet-1k (Deng et al., 2009; Russakovsky et al., 2015) and Places-10 (a 10-class subset of Places365 (Zhou et al., 2017)); and (b) BERT (Devlin et al., 2018) for sentiment classification on Stanford Sentiment Treebank (SST) (Socher et al., 2013) and toxicity classification of Wikipedia comments (Wulczyn et al., 2017). |
| Dataset Splits | Yes | For the rest of our study, we select a single sparse decision layer to balance performance and sparsity, specifically the sparsest model whose accuracy is within 5% of top validation set performance (details in Appendix D.1.1). |
| Hardware Specification | No | The paper does not provide specific hardware specifications (e.g., GPU model, CPU type) used for experiments. |
| Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x). |
| Experiment Setup | Yes | For the rest of our study, we select a single sparse decision layer to balance performance and sparsity, specifically the sparsest model whose accuracy is within 5% of top validation set performance (details in Appendix D.1.1). |
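
The selection rule quoted in the Dataset Splits and Experiment Setup rows (the sparsest model whose accuracy is within 5% of top validation set performance) can be illustrated with a minimal Python sketch. This is not the authors' code: the `PathPoint` container, the `select_sparse_model` function, and the example numbers are hypothetical, and the sketch assumes that "within 5%" means 5 absolute percentage points, which the quoted text does not specify.

```python
from typing import NamedTuple, Sequence


class PathPoint(NamedTuple):
    """One model on a regularization path (hypothetical container)."""
    reg_strength: float    # elastic net regularization strength
    nonzero_weights: int   # non-zero weights in the decision layer
    val_accuracy: float    # validation accuracy, in percent


def select_sparse_model(path: Sequence[PathPoint],
                        tolerance: float = 5.0) -> PathPoint:
    """Return the sparsest model within `tolerance` points of the best accuracy.

    Assumption: the tolerance is measured in absolute percentage points.
    """
    if not path:
        raise ValueError("path must contain at least one model")
    best_accuracy = max(p.val_accuracy for p in path)
    eligible = [p for p in path if p.val_accuracy >= best_accuracy - tolerance]
    # Fewest non-zero weights wins; ties are broken by higher accuracy.
    return min(eligible, key=lambda p: (p.nonzero_weights, -p.val_accuracy))


# Made-up numbers for illustration only; they are not results from the paper.
example_path = [
    PathPoint(reg_strength=1.0, nonzero_weights=5, val_accuracy=40.0),
    PathPoint(reg_strength=0.1, nonzero_weights=25, val_accuracy=71.5),
    PathPoint(reg_strength=0.01, nonzero_weights=400, val_accuracy=74.0),
    PathPoint(reg_strength=0.001, nonzero_weights=2048, val_accuracy=75.0),
]

chosen = select_sparse_model(example_path)
print(chosen)
```

With these illustrative values, the best validation accuracy is 75.0, so every model at or above 70.0 is eligible and the one with 25 non-zero weights is chosen. A relative tolerance (for example, 95% of the best accuracy) would be an equally plausible reading of the quoted rule and would change only the eligibility threshold.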