Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Robust Concept Erasure via Kernelized Rate-Distortion Maximization
Authors: Somnath Basu Roy Chowdhury, Nicholas Monath, Kumar Avinava Dubey, Amr Ahmed, Snigdha Chaturvedi
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conduct extensive experiments to demonstrate that KRaM is capable of erasing various types of concepts (categorical, continuous, and vector-valued variables) from data representations across a wide range of domains. |
| Researcher Affiliation | Collaboration | Somnath Basu Roy Chowdhury (UNC Chapel Hill), Nicholas Monath (Google DeepMind), Avinava Dubey (Google Research), Amr Ahmed (Google Research), Snigdha Chaturvedi (UNC Chapel Hill) |
| Pseudocode | Yes | Algorithm 1 Correlation Computation Routine |
| Open Source Code | Yes | The implementation of KRaM is publicly available at https://github.com/brcsomnath/KRaM. |
| Open Datasets | Yes | Jigsaw toxicity detection dataset [1], UCI Crimes [36], DIAL dataset [7], GloVe embeddings [46], CelebA [37], Colored MNIST [6]. These are all standard, publicly available datasets with proper citations. |
| Dataset Splits | No | This resulted in a dataset with a train/test split of (72k, 18k) for the religion concept and (106k, 26k) for the gender concept. |
| Hardware Specification | Yes | All networks were trained using a single 22GB NVIDIA Quadro RTX 6000 GPU and experiments were executed in the PyTorch [44] framework. |
| Software Dependencies | No | All networks were trained using a single 22GB NVIDIA Quadro RTX 6000 GPU and experiments were executed in the PyTorch [44] framework. We set these parameters by performing a grid search on the development set using Weights & Biases [11]. We use a scikit-learn MLP classifier (non-linear) [45]. |
| Experiment Setup | Yes | In our experiments, we primarily deal with two hyperparameters: the regularization constant, λ (in Equation 4), and σ, associated with the standard deviation of a Gaussian kernel ($k(x, y) = e^{-\lVert x - y \rVert^2/\sigma^2}$). |
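
The role of the σ hyperparameter in the Experiment Setup row can be illustrated with a minimal sketch of a Gaussian kernel. It assumes the common form k(x, y) = exp(-‖x - y‖² / σ²); the exact normalization used in the paper is not confirmed here. The function names `gaussian_kernel` and `gram_matrix` are hypothetical and are not taken from the KRaM repository.

```python
import math
from typing import List, Sequence


def gaussian_kernel(x: Sequence[float], y: Sequence[float], sigma: float) -> float:
    """Gaussian kernel, assuming the form exp(-||x - y||^2 / sigma^2)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if len(x) != len(y):
        raise ValueError("x and y must have the same dimension")
    # Squared Euclidean distance between the two vectors.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / sigma ** 2)


def gram_matrix(points: Sequence[Sequence[float]], sigma: float) -> List[List[float]]:
    """Pairwise kernel values for a small set of points (illustrative only)."""
    return [[gaussian_kernel(p, q, sigma) for q in points] for p in points]


if __name__ == "__main__":
    pts = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
    # A larger sigma makes distant points look more similar.
    for sigma in (0.5, 1.0, 5.0):
        print(sigma, gram_matrix(pts, sigma)[0])
```

A smaller σ makes the kernel value decay faster with distance, which is why a bandwidth like this is typically tuned by grid search on a development set alongside λ.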