k-CoRating: Filling Up Data to Obtain Privacy and Utility
Authors: Feng Zhang, Victor Lee, Ruoming Jin
AAAI 2014
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that such a model could greatly reduce the risk of being subject to Narayanan's attacks. Though k-coRating may seem similar to k-anonymity (Samarati 2001), it is designed with three major differences from k-anonymity. The first is that k-coRating handles datasets (typically user-item ratings) that have a large number of attributes, any subset of which behaves as a quasi-identifier in the sense arising from k-anonymity. The second is that k-coRated privacy rests on each record, together with at least k-1 others, having non-null values for exactly the same subset of attributes; the records need not take identical values under each attribute, as k-anonymity requires. The third is that the privacy of k-coRating is achieved by filling necessary NULL cells with significant values, rather than by generalization and suppression techniques. All claims are verified by experimental results. |
| Researcher Affiliation | Academia | Feng Zhang1, Victor E. Lee2, and Ruoming Jin3 1School of Computer Science, China University of Geosciences, Wuhan, Hubei, China 2Department of Mathematics and Computer Science, John Carroll University, University Heights, OH, USA 3Department of Computer Science, Kent State University, Kent, OH, USA |
| Pseudocode | Yes | Algorithm 1: sub-GeCom: k-coRating an Already Sorted Matrix M2; Algorithm 2: GeCom: Generate k-coRated Matrix M; Algorithm 3: PaGeCom: A Parallel Algorithm to Generate k-coRated Matrix M |
| Open Source Code | No | The paper does not contain any explicit statement about releasing the source code for its methodology, nor does it provide a link to a code repository. |
| Open Datasets | Yes | Experiments have been done using four popular benchmark datasets: two MovieLens datasets (http://www.grouplens.org/node/73), one Epinions dataset (http://www.epinions.com/), and one Netflix prize dataset (http://www.netflixprize.com). |
| Dataset Splits | Yes | For the MovieLens 100K dataset, we used the prepared 80%/20% splits of the dataset, i.e., u1.base and u1.test through u5.base and u5.test, to do the 5-fold cross-validation experiments; for the other datasets, we used 10-fold cross-validation to evaluate the prediction accuracy. |
| Hardware Specification | Yes | The implementation was run on a laptop with an Intel Core i7-2640M CPU at 2.80GHz and 8GB RAM, running an Ubuntu 12.04 virtual machine on a Windows 8 64-bit host operating system. For the Netflix prize dataset, the laptop's computing resources were insufficient, so we implemented and ran a parallel version of the algorithm GeCom (Algorithm 3, PaGeCom) at the Ohio Supercomputer Center. |
| Software Dependencies | No | All the algorithms were implemented in C/C++. The implementation was run on a laptop with an Intel Core i7-2640M CPU at 2.80GHz and 8GB RAM, running an Ubuntu 12.04 virtual machine on a Windows 8 64-bit host operating system. |
| Experiment Setup | Yes | For the MovieLens 100K dataset, we used the prepared 80%/20% splits of the dataset, i.e., u1.base and u1.test through u5.base and u5.test, to do the 5-fold cross-validation experiments; for the other datasets, we used 10-fold cross-validation to evaluate the prediction accuracy. For the trust derivation, we computed the propagation at most two times. |
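The k-coRated property quoted above (each record shares its exact set of non-NULL attributes with at least k-1 other records, without requiring identical values) can be checked mechanically. The sketch below is not the authors' code; it is a minimal Python illustration of the property under the assumption that NULL cells are modelled as `None`:

```python
# Minimal sketch (not the paper's implementation) of the k-coRated check:
# a rating matrix is k-coRated if every row's set of rated (non-NULL)
# columns is shared by at least k rows in total (the row itself plus
# at least k-1 others). Values themselves need not match.
from collections import Counter

def is_k_corated(matrix, k):
    """Return True if every row's non-NULL column set occurs >= k times."""
    signatures = [frozenset(j for j, v in enumerate(row) if v is not None)
                  for row in matrix]
    counts = Counter(signatures)
    return all(counts[sig] >= k for sig in signatures)

# Toy example: rows 0 and 1 rate items {0, 2}; rows 2 and 3 rate {1}.
ratings = [
    [5, None, 3],
    [4, None, 1],
    [None, 2, None],
    [None, 5, None],
]
print(is_k_corated(ratings, 2))  # True: each signature occurs twice
print(is_k_corated(ratings, 3))  # False: no signature occurs three times
```

The paper's GeCom algorithms achieve this property by filling selected NULL cells with significant (predicted) values until the signature groups reach size k; the check above only verifies the end condition.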