Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026.
Preserving Task-Relevant Information Under Linear Concept Removal
Authors: Floris Holstege, Shauli Ravfogel, Bram Wouters
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirically, SPLINCE outperforms baselines on benchmarks such as Bias in Bios and Winobias, removing protected attributes while minimally damaging main-task information. |
| Researcher Affiliation | Academia | University of Amsterdam, Department of Quantitative Economics; New York University, Center for Data Science; Tinbergen Institute |
| Pseudocode | No | The paper contains mathematical theorems and proofs but does not include any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | See this link for our code for the experiments, as well as an implementation of SPLINCE. |
| Open Datasets | Yes | We use the Bias in Bios dataset on professions and biographies from De-Arteaga et al. [2019].; We use the Multilingual Text Detoxification dataset from Dementieva et al. [2024].; We use a dataset from Limisiewicz et al. [2024], which we refer to as the profession dataset.; We use the Winobias dataset from Zhao et al. [2018].; We conduct an experiment for the CelebA dataset [Liu et al., 2015]; Waterbirds dataset: introduced by Sagawa et al. [2020]. |
| Dataset Splits | Yes | We subsample 75,000 observations for the training set, 10,000 for the validation set, and 25,000 for the test set. For all three sets, we subsample such that $p(y_{\text{prof}} = 1) = 0.5$. |
| Hardware Specification | No | The computer resources used for this paper were very modest compared to nowadays standards and therefore not mentioned explicitly. |
| Software Dependencies | No | we use a pre-trained BERT model implemented in the transformers package [Wolf et al., 2019]: `BertForSequenceClassification.from_pretrained("bert-base-uncased")`. |
| Experiment Setup | Yes | training with a batch size of 16, learning rate of $10^{-5}$ and a weight decay of $10^{-6}$, using an SGD optimizer, for 2 epochs. |
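
The Dataset Splits row describes class-balanced subsampling into three sets. The following is a minimal sketch of one way such a split could be drawn, not the authors' implementation: the function name `balanced_splits`, the seed, and the toy label list are all hypothetical, and the split sizes are scaled down to 1/100 of the reported 75,000 / 10,000 / 25,000 so the example runs instantly.

```python
import random


def balanced_splits(labels, sizes, seed=0):
    """Draw disjoint, class-balanced index sets from binary labels.

    labels: sequence of 0/1 values.
    sizes:  dict mapping split name -> number of examples (must be even).
    Returns a dict mapping split name -> shuffled list of indices.
    (Hypothetical helper for illustration; not taken from the paper's code.)
    """
    if any(n % 2 for n in sizes.values()):
        raise ValueError("each split size must be even to reach p(y = 1) = 0.5")

    rng = random.Random(seed)
    pos = [i for i, y in enumerate(labels) if y == 1]
    neg = [i for i, y in enumerate(labels) if y == 0]
    rng.shuffle(pos)
    rng.shuffle(neg)

    per_class = sum(sizes.values()) // 2
    if len(pos) < per_class or len(neg) < per_class:
        raise ValueError("not enough examples of one class for the requested sizes")

    splits, start = {}, 0
    for name, n in sizes.items():
        half = n // 2
        # Take the next unused slice of each class so splits never overlap.
        idx = pos[start:start + half] + neg[start:start + half]
        rng.shuffle(idx)
        splits[name] = idx
        start += half
    return splits


# Toy data (assumed): 1,500 positives and 3,500 negatives.
labels = [1] * 1500 + [0] * 3500
splits = balanced_splits(labels, {"train": 750, "val": 100, "test": 250})
for name, idx in splits.items():
    share = sum(labels[i] for i in idx) / len(idx)
    print(name, len(idx), share)
```

Slicing each shuffled class list in sequence keeps the three sets disjoint while giving every set exactly half positives, which is the balance condition the row states.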