Unlearning during Learning: An Efficient Federated Machine Unlearning Method
Authors: Hanlin Gu, Gongxi Zhu, Jie Zhang, Xinyuan Zhao, Yuxing Han, Lixin Fan, Qiang Yang
IJCAI 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We conducted extensive experiments on MNIST, CIFAR10, and CIFAR100 datasets to evaluate the performance of FedAU. The results demonstrate that FedAU effectively achieves the desired unlearning effect while maintaining model accuracy. |
| Researcher Affiliation | Collaboration | Hanlin Gu1, Gongxi Zhu2,3, Jie Zhang4, Xinyuan Zhao3, Yuxing Han3, Lixin Fan1 and Qiang Yang1. 1AI Lab, WeBank; 2University of Electronic Science and Technology of China; 3Shenzhen International Graduate School, Tsinghua University; 4Nanyang Technological University |
| Pseudocode | Yes | Algorithm 1: Unlearning Sample in FL (Learning Module, Auxiliary Unlearning Module, and Linear Operation). Algorithm 2: Unlearning Class in FL (Learning Module, Auxiliary Unlearning Module, and Linear Operation). |
| Open Source Code | No | The paper does not provide an explicit statement or link to open-source code for the described methodology. |
| Open Datasets | Yes | We conduct experiments on three datasets: MNIST [LeCun et al., 2010], CIFAR10 and CIFAR100 [Krizhevsky et al., 2014]. |
| Dataset Splits | No | The paper mentions proportions for unlearning samples (5%, 10%, 20%) and unlearning client data (20%, 50%, 100%), but does not specify standard training, validation, and test dataset splits with percentages or counts. |
| Hardware Specification | No | The paper does not explicitly describe the hardware used for running its experiments, such as specific GPU or CPU models. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., Python, PyTorch versions) required to replicate the experiment. |
| Experiment Setup | Yes | We simulate a HFL scenario consisting of 10 clients under IID and Non-IID settings [Li et al., 2022] (following the Dirichlet distribution, dir(γ)). For unlearning samples, we employed the backdoor technique to generate the unlearning samples [Gao et al., 2022]. The proportion of unlearning samples was set to 5%, 10%, and 20% of the dataset. For unlearning a client, we considered scenarios where the data from the unlearning client accounted for 20%, 50%, and 100% of the data from the other clients. ... We adopt LeNet [LeCun et al., 1998] for conducting experiments on MNIST, AlexNet [Krizhevsky et al., 2012] on CIFAR10, and ResNet18 [He et al., 2016] on CIFAR100. ... The main results on the CIFAR10 dataset are presented in Tab. 2. From these results, we can draw three conclusions: 1. Among all the schemes, the Retraining scheme and schemes involving fine-tuning operations consume considerably more time than other methods; 2. Although the Amnesiac and FedRecovery schemes require a relatively small amount of time for unlearning, they are still several orders of magnitude slower than FedAU; 3. FedAU incurs minimal additional training time, e.g., an additional 2s for AlexNet-CIFAR10. |
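The setup cell above mentions simulating Non-IID clients via a Dirichlet distribution, dir(γ). The paper does not include partitioning code; the following is a minimal illustrative sketch of the standard per-class Dirichlet split used in such FL benchmarks (function name, γ value, and seed are assumptions, not from the paper).

```python
import numpy as np

def dirichlet_partition(labels, n_clients=10, gamma=0.5, seed=0):
    """Split sample indices among clients using a per-class
    Dirichlet(gamma) distribution (smaller gamma = more skew)."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    client_indices = [[] for _ in range(n_clients)]
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        rng.shuffle(idx)
        # Fraction of this class assigned to each client.
        props = rng.dirichlet(np.full(n_clients, gamma))
        cuts = (np.cumsum(props)[:-1] * len(idx)).astype(int)
        for client, part in enumerate(np.split(idx, cuts)):
            client_indices[client].extend(part.tolist())
    return [np.array(ix) for ix in client_indices]

# Example: 1000 samples over 10 classes; each client gets a skewed shard.
labels = np.repeat(np.arange(10), 100)
shards = dirichlet_partition(labels, n_clients=10, gamma=0.5)
print(sorted(len(s) for s in shards))
```

With γ → ∞ the split approaches IID; small γ concentrates each class on a few clients, matching the Non-IID regime the paper evaluates.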