A Group-Based Personalized Model for Image Privacy Classification and Labeling

Authors: Haoti Zhong, Anna Squicciarini, David Miller, Cornelia Caragea

IJCAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show that, on a dataset of 114 users and about 3,400 image labelings, our model achieves an overall accuracy measure of 79.31% when a few (15) images are used to infer group associations for each (test) user."
Researcher Affiliation | Academia | Haoti Zhong, Dept. of Electrical Eng., Pennsylvania State University (hzz133@psu.edu); Anna Squicciarini, Information Sciences and Technology, Pennsylvania State University (acs20@psu.edu); David Miller, Dept. of Electrical Eng., Pennsylvania State University (djmiller@engr.psu.edu); Cornelia Caragea, Department of Computer Science, University of North Texas (ccaragea@unt.edu)
Pseudocode | No | The paper describes its Expectation-Maximization (EM) algorithm and gradient-ascent procedure through mathematical equations and textual descriptions, but it does not include a structured pseudocode block or algorithm listing.
Open Source Code | No | The paper makes no explicit statement about releasing source code for its methodology and provides no link to a code repository.
Open Datasets | No | "The imageset was taken from the Picalert study, a collection of images with varying degrees of sensitivity [Zerr et al., 2012]. We collected our own dataset for testing purposes as follows... In total, 114 valid user responses were collected and 3420 labels in total (2496 public labels and 924 private labels)."
Dataset Splits | Yes | "We first divide the dataset into 10 (outer) folds, and use 9 of these folds for training-plus-validation, with the last fold used for testing. To calculate the optimized hyper-parameter, we further split the collection of nine training-plus-validation fold samples, again using 5-fold cross validation, with four of these (inner) folds used for training and one for validation."
Hardware Specification | No | The paper mentions using deep learning (via Caffe) for feature extraction, but it does not specify any hardware details, such as GPU models, CPU types, or memory, used to run the experiments.
Software Dependencies | No | The paper mentions using Caffe [Jia et al., 2014] for deep-learning features and SVMs for baselines, but it does not provide version numbers for these or any other software dependencies, which reproducibility would require.
Experiment Setup | Yes | "The search grid for K is chosen from 4 to 7 with search step of 1, and M is chosen over a range from 20-50 with a search step of 5, to maximize the average (inner) validation fold CV accuracy. We found that M=40 and K = 6 fit best for this dataset. Larger M and K may be found for larger datasets (with more users). L was chosen to be the minimum number such that the patches cover 90% of the image support. Thus, L = 100."
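The nested cross-validation protocol quoted above (10 outer folds; an inner 5-fold CV on the training-plus-validation data that selects the hyperparameters maximizing mean validation accuracy) can be sketched as follows. The paper's group-based model is not available in code form, so this sketch substitutes a scikit-learn SVC with a small C grid as a placeholder for the model and its (M, K) grid, and runs on synthetic data; only the split structure mirrors the paper.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.svm import SVC

# Synthetic stand-in data (the real study used 3,420 labels from 114 users).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * rng.normal(size=200) > 0).astype(int)

# Placeholder hyperparameter grid; the paper searches K in {4..7} and
# M in {20, 25, ..., 50} for its group-based model instead.
grid = [0.1, 1.0, 10.0]

outer = KFold(n_splits=10, shuffle=True, random_state=0)
outer_scores = []
for trval_idx, test_idx in outer.split(X):
    X_trval, y_trval = X[trval_idx], y[trval_idx]

    # Inner 5-fold CV: pick the grid point with the best mean validation accuracy.
    inner = KFold(n_splits=5, shuffle=True, random_state=1)
    best_c, best_acc = None, -1.0
    for c in grid:
        accs = []
        for tr_idx, val_idx in inner.split(X_trval):
            clf = SVC(C=c).fit(X_trval[tr_idx], y_trval[tr_idx])
            accs.append(clf.score(X_trval[val_idx], y_trval[val_idx]))
        if np.mean(accs) > best_acc:
            best_acc, best_c = float(np.mean(accs)), c

    # Refit on all training-plus-validation data with the chosen hyperparameter,
    # then score once on the held-out outer test fold.
    clf = SVC(C=best_c).fit(X_trval, y_trval)
    outer_scores.append(clf.score(X[test_idx], y[test_idx]))

print(round(float(np.mean(outer_scores)), 3))
```

Because each outer test fold is never touched during the inner grid search, the mean outer-fold accuracy is an unbiased estimate of the tuned model's performance, which is the role the paper's reported 79.31% plays.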