Deep Regression Unlearning

Authors: Ayush Kumar Tarun, Vikram Singh Chundawat, Murari Mandal, Mohan Kankanhalli

ICML 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conduct regression unlearning experiments for computer vision, natural language processing and forecasting applications. Our methods show excellent performance for all these datasets across all the metrics. Source code: https://github.com/ayu987/deep-regression-unlearning
Researcher Affiliation | Collaboration | 1) Mavvex Labs, India; 2) School of Computer Engineering, Kalinga Institute of Industrial Technology, Bhubaneswar, India; 3) School of Computing, National University of Singapore.
Pseudocode | Yes | Algorithm 1: Blindspot Unlearning; Algorithm 2: Gaussian-Amnesiac Learning (both sketched after the table).
Open Source Code | Yes | Source code: https://github.com/ayu987/deep-regression-unlearning
Open Datasets | Yes | We use four datasets in our experiments. Two computer vision datasets are used: i. AgeDB (Moschoglou et al., 2017) contains 16,488 images of 568 subjects with age labels between 1 and 101, ii. IMDB-Wiki (Rothe et al., 2015) contains 500k+ images with age labels varying from 1 to 100. One NLP dataset is used: iii. Semantic Text Similarity Benchmark (STS-B) SemEval-2017 dataset (Cer et al., 2017) has around 7200 sentence pairs and labels corresponding to the similarity between them on a scale of 0 to 5, categorized by genre and year. One forecasting dataset is used: iv. UCI Electricity Load dataset (Yu et al., 2016) contains data of electricity consumption of 370 customers, aggregated on an hourly level.
Dataset Splits | No | The paper mentions training data and test data but does not explicitly detail a validation split. For example, it states: 'We train the model for 100 epochs with initial learning rate of 0.01 and reduce it on plateau by a factor of 0.1', which implies a held-out validation set is monitored, but how the data is divided into train/validation/test is not specified.
Hardware Specification | Yes | All the experiments are performed on NVIDIA Tesla-A100 (80GB).
Software Dependencies | No | The paper does not specify software dependencies with version numbers (e.g., Python 3.x, PyTorch 1.x).
Experiment Setup | Yes | We train the model for 100 epochs with an initial learning rate of 0.01 and reduce it on plateau by a factor of 0.1. The models are optimized on L1-loss with the Adam optimizer. In Fine Tune, 5 epochs of training are done with a learning rate of 0.001. We run gradient ascent for 1 epoch with a learning rate of 0.001 on the AgeDB dataset. In Gaussian Amnesiac, 1 epoch of amnesiac learning is done with a learning rate of 0.001. In Blindspot, the blindspot model is trained for 2 epochs with a learning rate of 0.01. Subsequently, 1 epoch of unlearning is performed on the original model with a learning rate of 0.001. (A hedged sketch of this training recipe and of the two unlearning procedures follows below.)
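
The quoted training recipe (100 epochs, Adam on L1 loss, initial learning rate 0.01 reduced on plateau by a factor of 0.1) maps onto a standard PyTorch loop. The following is a minimal sketch under those assumptions only; the model, data, and validation loader are placeholders, since the paper does not state how the data is split.

```python
# Minimal sketch of the quoted training recipe: 100 epochs, Adam, L1 loss,
# initial lr 0.01 reduced on plateau by a factor of 0.1.
# The model and data below are placeholders, not the paper's.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 1))  # placeholder regressor
train_loader = DataLoader(TensorDataset(torch.randn(1024, 128), torch.randn(1024, 1)), batch_size=64)
val_loader = DataLoader(TensorDataset(torch.randn(256, 128), torch.randn(256, 1)), batch_size=64)

criterion = nn.L1Loss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(optimizer, mode="min", factor=0.1)

for epoch in range(100):
    model.train()
    for x, y in train_loader:
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

    # Reduce the learning rate when the held-out L1 loss plateaus.
    model.eval()
    with torch.no_grad():
        val_loss = sum(criterion(model(x), y).item() for x, y in val_loader) / len(val_loader)
    scheduler.step(val_loss)
```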
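
The two procedures named in the Pseudocode row are given in the paper as Algorithm 1 (Blindspot Unlearning) and Algorithm 2 (Gaussian-Amnesiac Learning). The sketch below is an illustrative reconstruction, not the authors' code: it assumes Blindspot unlearning briefly trains a freshly initialized "blindspot" model on the retain set only and then updates the original model to match the blindspot's predictions on forget samples while keeping its fit on retain samples, and that Gaussian-Amnesiac learning replaces forget-set labels with draws from a Gaussian fitted to the retained labels before a short fine-tune. All function names are hypothetical, and the paper's full Blindspot objective may include additional terms omitted here.

```python
# Illustrative, hedged sketch of the two unlearning procedures (NOT the
# authors' implementation). Hyperparameter defaults mirror the values quoted
# in the Experiment Setup row.
import copy
import torch
import torch.nn as nn


def blindspot_unlearn(model, retain_loader, forget_loader,
                      blindspot_epochs=2, unlearn_epochs=1,
                      blindspot_lr=0.01, unlearn_lr=0.001, device="cpu"):
    """Assumed scheme: a freshly re-initialized 'blindspot' copy is briefly
    trained on retain data only; the original model is then optimized to keep
    its retain-set fit while mimicking the blindspot on the forget set."""
    l1 = nn.L1Loss()

    # Step 1: partially expose a blindspot copy to the retain set only.
    blindspot = copy.deepcopy(model).to(device)
    blindspot.apply(lambda m: m.reset_parameters() if hasattr(m, "reset_parameters") else None)
    opt_b = torch.optim.Adam(blindspot.parameters(), lr=blindspot_lr)
    for _ in range(blindspot_epochs):
        for x, y in retain_loader:
            x, y = x.to(device), y.to(device)
            opt_b.zero_grad()
            l1(blindspot(x), y).backward()
            opt_b.step()
    blindspot.eval()

    # Step 2: one epoch of unlearning -- retain samples keep their true labels,
    # forget samples are pulled toward the blindspot model's predictions.
    # (zip() simply pairs batches; the paper may pair/weight them differently.)
    opt = torch.optim.Adam(model.parameters(), lr=unlearn_lr)
    model.train()
    for _ in range(unlearn_epochs):
        for (xr, yr), (xf, _) in zip(retain_loader, forget_loader):
            xr, yr, xf = xr.to(device), yr.to(device), xf.to(device)
            with torch.no_grad():
                forget_targets = blindspot(xf)
            opt.zero_grad()
            loss = l1(model(xr), yr) + l1(model(xf), forget_targets)
            loss.backward()
            opt.step()
    return model


def gaussian_amnesiac_relabel(forget_targets, retain_targets):
    """Assumed scheme: forget-set labels are replaced with samples from a
    Gaussian fitted to the retained labels; the relabelled data would then be
    used for 1 epoch of fine-tuning at lr=0.001 (fine-tuning loop not shown)."""
    mu = retain_targets.float().mean()
    sigma = retain_targets.float().std()
    return mu + sigma * torch.randn_like(forget_targets.float())
```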