Task-aware Privacy Preservation for Multi-dimensional Data
Authors: Jiangnan Cheng, Ao Tang, Sandeep Chinchali
ICML 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments demonstrate that our task-aware approach significantly improves ultimate task accuracy compared to standard benchmark LDP approaches with the same level of privacy guarantee. We validate the effectiveness of our task-aware approach through three real-world experiments. |
| Researcher Affiliation | Academia | 1School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 2Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, TX. Correspondence to: Jiangnan Cheng <jc3377@cornell.edu>, Ao Tang <atang@cornell.edu>, Sandeep Chinchali <sandeepc@utexas.edu>. |
| Pseudocode | Yes | Algorithm 1: Task-aware Algorithm for ε-LDP Preservation in General Settings |
| Open Source Code | Yes | Our code is publicly available at https://github.com/chengjiangnan/task_aware_privacy. |
| Open Datasets | Yes | Three applications and corresponding datasets from the standard UCI Machine Learning Repository (Dua & Graff, 2017) are considered: mean estimation of hourly household power consumption, real estate valuation, and breast cancer detection. Dua, D. and Graff, C. UCI Machine Learning Repository, 2017. URL http://archive.ics.uci.edu/ml. |
| Dataset Splits | No | Table 1 (Evaluation Details): Household Power — 1,417 samples, 0.7/0.3 train/test split, training epochs N/A, runtime < 1 min; Real Estate — 414 samples, 0.7/0.3 split, 2,000 epochs, < 2 hrs; Breast Cancer — 569 samples, 0.7/0.3 split, 2,000 epochs, < 2 hrs. The paper specifies a 0.7/0.3 train/test split but does not mention a separate validation split (see the split sketch after the table). |
| Hardware Specification | Yes | Our evaluation runs on a personal laptop with a 2.7 GHz Intel Core i5 processor and 8 GB of 1867 MHz DDR3 memory. |
| Software Dependencies | No | Our code is based on PyTorch. We use the Adam optimizer and learning rate 10⁻³ for all the applications. The paper names the software used (PyTorch, the Adam optimizer) but does not provide version numbers for these dependencies. |
| Experiment Setup | Yes | We use the Adam optimizer and learning rate 10⁻³ for all the applications. The number of samples, train/test split, training epochs, and resulting runtime are summarized in Table 1. For the task function f, we use a one-hidden-layer feedforward neural network with input size n, hidden size 1.5n, and output size 1... The activation function used by the hidden layer and output layer is a Rectified Linear Unit (ReLU). For the encoder/decoder, we use a one-layer neural network (linear model)... We use a one-hidden-layer feedforward neural network with input size n, hidden size n, and output size n... The activation functions used by the hidden layer and output layer are a logistic and identity function, respectively. For the gradient-based learning algorithm... we set η = 0.2 and η = 0.001... and in both experiments, for each epoch we update θ_e and θ_d by 15 steps. (See the architecture sketch after the table.) |
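The 0.7/0.3 split quoted under Dataset Splits is the only partition the paper reports, with no validation set. A minimal sketch of how such a split could be reproduced in PyTorch follows; the helper name, placeholder data, and fixed seed are assumptions for illustration, not details from the paper.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical reproduction of the 0.7/0.3 train/test split from Table 1.
# Sample counts come from the quoted table; the fixed seed is an assumption.
def split_dataset(X: torch.Tensor, y: torch.Tensor, seed: int = 0):
    dataset = TensorDataset(X, y)
    n_train = int(0.7 * len(dataset))          # 70% for training
    n_test = len(dataset) - n_train            # remaining 30% for testing
    generator = torch.Generator().manual_seed(seed)
    return random_split(dataset, [n_train, n_test], generator=generator)

# Example: breast cancer detection has 569 samples per Table 1.
X = torch.randn(569, 30)                       # placeholder features
y = torch.randint(0, 2, (569,))                # placeholder labels
train_set, test_set = split_dataset(X, y)
print(len(train_set), len(test_set))           # 398 171
```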
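The Experiment Setup excerpt pins down layer sizes and activations, so the architectures can be rendered directly. Below is a minimal PyTorch sketch of the task network f, the linear encoder/decoder, and the quoted one-hidden-layer variant; the input dimension n, latent size m, and all variable names are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

n = 6  # input dimension; application-specific, chosen here for illustration

# Task function f: one hidden layer of size 1.5n, ReLU on both the hidden
# and output layers, scalar output, as quoted in the experiment setup.
task_f = nn.Sequential(
    nn.Linear(n, int(1.5 * n)),
    nn.ReLU(),
    nn.Linear(int(1.5 * n), 1),
    nn.ReLU(),
)

# Linear (one-layer) encoder/decoder, as quoted. The latent dimension m
# is an assumption; the excerpt does not fix it.
m = n
encoder = nn.Linear(n, m)
decoder = nn.Linear(m, n)

# The quoted one-hidden-layer variant (input n, hidden n, output n) with a
# logistic hidden activation and identity output activation.
nonlinear_net = nn.Sequential(
    nn.Linear(n, n),
    nn.Sigmoid(),
    nn.Linear(n, n),  # identity output activation: no extra module needed
)

# Adam with learning rate 1e-3 for all applications, as quoted; per the
# excerpt, each epoch updates the encoder/decoder parameters (θ_e, θ_d)
# by 15 gradient steps.
params = list(encoder.parameters()) + list(decoder.parameters())
optimizer = torch.optim.Adam(params, lr=1e-3)
```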