Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).
Ferrari: Federated Feature Unlearning via Optimizing Feature Sensitivity
Authors: Hanlin Gu, WinKent Ong, Chee Seng Chan, Lixin Fan
NeurIPS 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results and theoretical analysis demonstrate the effectiveness of Ferrari across various feature unlearning scenarios, including sensitive, backdoor, and biased features. The code is publicly available at https://github.com/OngWinKent/Federated-Feature-Unlearning |
| Researcher Affiliation | Collaboration | ¹CISiP, Universiti Malaya, Malaysia; ²AI Lab, WeBank, PR China |
| Pseudocode | Yes | Algorithm 1 Federated Feature Unlearning |
| Open Source Code | Yes | The code is publicly available at https://github.com/OngWinKent/Federated-Feature-Unlearning |
| Open Datasets | Yes | We employ ResNet18 [90] on image datasets: MNIST [89], Colored-MNIST (CMNIST) [89], Fashion-MNIST [91], CIFAR-10, CIFAR-20, CIFAR-100 [92] and ImageNet [93]. For tabular datasets, such as Adult Census Income (Adult) [85] and Diabetes [86]... Additionally, we utilize the transformer-based BERT model [94] for the text dataset, specifically the IMDB movie reviews dataset [95]. |
| Dataset Splits | No | The paper specifies training and test set sizes for its datasets (e.g., MNIST has 60,000 training examples and 10,000 test examples; CIFAR-10 has 50,000 training examples and 10,000 test examples) but does not explicitly mention or quantify a separate validation set split. |
| Hardware Specification | Yes | We conduct experiments on a single NVIDIA A100 GPU. |
| Software Dependencies | No | The paper describes the models used (ResNet18, BERT) and general types of datasets (image, tabular, text) but does not list specific versions of software libraries (e.g., PyTorch, TensorFlow, scikit-learn) or programming languages used to implement the experiments. |
| Experiment Setup | Yes | For federated feature unlearning experiments, we set hyperparameters: learning rate η = 0.0001, sample size N = 20, and random Gaussian noise with standard deviation ranging from 0.05 ≤ σ ≤ 1.0 (see Sec. 5.5) across iterations of N. |
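
The settings quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes that feature sensitivity is estimated by Monte Carlo sampling as the average ratio of output change to perturbation size when Gaussian noise is applied to one feature. The function names, the toy linear model, and the linear spacing of σ across the N samples are hypothetical choices made for illustration and are not taken from the paper or its repository.

```python
import numpy as np

# Values quoted in the Experiment Setup row.
LEARNING_RATE = 1e-4            # eta (shown for completeness; unused in this sketch)
SAMPLE_SIZE = 20                # N
SIGMA_MIN, SIGMA_MAX = 0.05, 1.0


def noise_schedule(n=SAMPLE_SIZE, lo=SIGMA_MIN, hi=SIGMA_MAX):
    """Return n standard deviations from lo to hi.

    Linear spacing is an assumption; the quoted text only gives the range.
    """
    return np.linspace(lo, hi, n)


def estimate_sensitivity(model, x, feature_mask, rng, sigmas=None):
    """Monte Carlo estimate of mean ||f(x + d) - f(x)|| / ||d||.

    d is Gaussian noise restricted to the entries selected by feature_mask.
    """
    sigmas = noise_schedule() if sigmas is None else sigmas
    base = model(x)
    ratios = []
    for sigma in sigmas:
        delta = rng.normal(0.0, sigma, size=x.shape) * feature_mask
        change = np.linalg.norm(model(x + delta) - base)
        ratios.append(change / np.linalg.norm(delta))
    return float(np.mean(ratios))


# Toy usage: a linear model that uses feature 0 and ignores feature 1.
rng = np.random.default_rng(0)
weights = np.array([[2.0, 0.0, 0.0],
                    [0.0, 0.0, 1.0]])


def toy_model(x):
    return weights @ x


x = np.ones(3)
s_used = estimate_sensitivity(toy_model, x, np.array([1.0, 0.0, 0.0]), rng)
s_ignored = estimate_sensitivity(toy_model, x, np.array([0.0, 1.0, 0.0]), rng)
```

In this toy setting the estimate is high for the feature the model relies on and zero for the feature it ignores, which is the kind of signal an unlearning objective could drive toward zero for the feature to be forgotten.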