Learning to Generate Inversion-Resistant Model Explanations
Authors: Hoyong Jeong, Suyoung Lee, Sung Ju Hwang, Sooel Son
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate GNIME using four datasets: (1) CelebA [17], (2) MNIST [14], (3) CIFAR10 [13], and (4) ImageNet [6], each of which is freely available for research purposes. We demonstrate that GNIME significantly decreases the information leakage in model explanations, decreasing transferable classification accuracy in facial recognition models by up to 84.8% while preserving the original functionality of model explanations. |
| Researcher Affiliation | Academia | Hoyong Jeong, Suyoung Lee, Sung Ju Hwang, Sooel Son KAIST {yongari38, suyoung.lee, sjhwang82, sl.son}@kaist.ac.kr |
| Pseudocode | Yes | Algorithm 1 Training algorithm in Phase I |
| Open Source Code | Yes | To facilitate further research, we publish GNIME at https://github.com/WSP-LAB/GNIME. |
| Open Datasets | Yes | We evaluate GNIME using four datasets: (1) CelebA [17], (2) MNIST [14], (3) CIFAR10 [13], and (4) ImageNet [6], each of which is freely available for research purposes. |
| Dataset Splits | No | Specifically, we split this attack dataset with an 80/20 ratio for the train/test split. (No explicit mention of a validation split for model tuning). |
| Hardware Specification | Yes | All experiments took place on a system equipped with 512GBs of RAM, two Intel Xeon Gold 6258R CPUs, and four RTX 3090 GPUs. |
| Software Dependencies | No | The paper does not provide specific version numbers for software dependencies or libraries used in the experiments. |
| Experiment Setup | Yes | In LNG, we set λ = 500 for CelebA models and λ = 100 for MNIST, CIFAR-10, and ImageNet-100 models, then deploy the final model after 500 epochs. |
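The "Dataset Splits" row notes an 80/20 train/test split of the attack dataset with no separate validation split. A minimal sketch of such a split is below; the function and variable names (`split_attack_dataset`, `samples`) are illustrative assumptions, not identifiers from the GNIME codebase.

```python
import random

def split_attack_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle and split samples into train/test subsets (no validation set)."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    shuffled = samples[:]              # copy so the caller's list is untouched
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]

# Example: 100 samples -> 80 train, 20 test
train, test = split_attack_dataset(list(range(100)))
print(len(train), len(test))  # 80 20
```

Since the paper reports no validation split for model tuning, any hyperparameter selection (e.g., the λ values in the Experiment Setup row) would have to rely on the training portion or the reported final-epoch models.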