Controlling privacy in recommender systems

Authors: Yu Xin, Tommi Jaakkola

NeurIPS 2014

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We show theoretically and demonstrate empirically that a moderate number of public users with no access to private user information already suffices for reasonable accuracy. Moreover, we introduce a new privacy concept for gleaning relational information from private users while maintaining a first order deniability. We demonstrate gains from controlled access to private user preferences.
Researcher Affiliation | Academia | Yu Xin (CSAIL, MIT, yuxin@mit.edu); Tommi Jaakkola (CSAIL, MIT, tommi@csail.mit.edu)
Pseudocode | No | The paper describes algorithms (e.g., EM type algorithm, greedy coordinate descent) in narrative text but does not provide them in a structured pseudocode block or algorithm box.
Open Source Code | No | The paper does not provide any explicit statements about releasing source code or links to a code repository for the methodology described.
Open Datasets | Yes | We perform experiments on the Movielens 10M dataset which contains 10 million ratings from 69878 users on 10677 movies.
Dataset Splits | No | The paper states, "The test set contains 10 ratings for each user." It mentions using an "EM type algorithm for training" and that parameters are updated during training. However, it does not specify a distinct train/validation/test split for the overall dataset (e.g., 80/10/10 percentage or specific sample counts for each split), nor does it reference predefined splits with citations for the entire dataset being used.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments (e.g., GPU/CPU models, memory, or cloud instance types).
Software Dependencies | No | The paper does not provide specific version numbers for any software components, libraries, or solvers used in the experiments.
Experiment Setup | Yes | In the E-step, the current Σ is sent to public users to complete their rating vectors and send back to the server. In the M-step, Σ is then updated based on these full rating vectors. In the M-step, the weight for each private user is set to 1/2 compared to 1 for public users. During training, after processing w = 20 private users, we update parameters (µ, V), re-complete the rating vectors of public users, making predictions for next batch of private users more accurate.
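The E-step and weighted M-step quoted in the Experiment Setup row can be illustrated with a minimal sketch. This is not the authors' implementation: the summary above does not say how a public user completes a rating vector, so the sketch assumes a Gaussian conditional-mean fill-in given (µ, Σ), and it assumes full rating vectors are already available for every user entering the M-step. The function names `complete_ratings` and `weighted_m_step` and the `ridge` parameter are hypothetical.

```python
import numpy as np


def complete_ratings(r, observed, mu, sigma, ridge=1e-3):
    """E-step sketch: fill a user's unobserved ratings.

    Assumption (not from the paper summary): missing entries are set to the
    Gaussian conditional mean given the observed entries, mu and sigma.
    """
    observed = np.asarray(observed, dtype=bool)
    o = np.flatnonzero(observed)    # indices of rated items
    m = np.flatnonzero(~observed)   # indices of unrated items
    full = np.asarray(r, dtype=float).copy()
    if o.size == 0:
        # Nothing observed: fall back to the mean rating of each item.
        full[m] = mu[m]
        return full
    # Small ridge term keeps the observed block well conditioned.
    s_oo = sigma[np.ix_(o, o)] + ridge * np.eye(o.size)
    s_mo = sigma[np.ix_(m, o)]
    full[m] = mu[m] + s_mo @ np.linalg.solve(s_oo, full[o] - mu[o])
    return full


def weighted_m_step(full_vectors, weights):
    """M-step sketch: weighted mean and covariance of full rating vectors.

    Following the quoted setup, a caller would pass weight 1.0 for each
    public user and 0.5 for each private user.
    """
    x = np.asarray(full_vectors, dtype=float)
    w = np.asarray(weights, dtype=float)
    mu = (w[:, None] * x).sum(axis=0) / w.sum()
    centered = x - mu
    sigma = (w[:, None] * centered).T @ centered / w.sum()
    return mu, sigma
```

In a training loop following the quoted schedule, one would call `weighted_m_step` after each batch of w = 20 private users and then re-run `complete_ratings` for the public users with the refreshed parameters.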