User-item fairness tradeoffs in recommendations

Authors: Sophie Greenwood, Sudalakshmee Chiniah, Nikhil Garg

NeurIPS 2024

Each entry below lists a reproducibility variable, its result, and the LLM's supporting response.
Research Type (Experimental): Theoretically, we develop a model of recommendations with user and item fairness objectives and characterize the solutions of fairness-constrained optimization. We identify two phenomena: (a) when user preferences are diverse, there is "free" item and user fairness; and (b) users whose preferences are misestimated can be especially disadvantaged by item fairness constraints. Empirically, we prototype a recommendation system for preprints on arXiv and implement our framework, measuring the phenomena in practice and showing how these phenomena inform the design of markets with recommendation systems-intermediated matching.
Researcher Affiliation (Academia): Sophie Greenwood, Computer Science, Cornell Tech; Sudalakshmee Chiniah, Operations Research and Information Engineering, Cornell Tech; Nikhil Garg, Operations Research and Information Engineering, Cornell Tech.
Pseudocode (No): The paper does not contain any structured pseudocode or algorithm blocks. It describes mathematical models and theoretical frameworks, but without formal algorithmic representations.
Open Source Code (Yes): Our code is available at the following repository: https://github.com/vschiniah/ArXiv_Recommendation_Research.
Open Datasets (Yes): We use data from arXiv and Semantic Scholar [1, 25]. As training for user preferences, we consider 139,308 CS papers by 178,260 distinct authors before 2020; as the items to be recommended, we consider the 14,307 papers uploaded to arXiv in 2020.
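The split above is temporal rather than random: papers before 2020 estimate user preferences, and papers from 2020 are the candidate items. A minimal sketch of such a split, using a toy pandas DataFrame (the column names and values are illustrative assumptions, not taken from the released code):

```python
import pandas as pd

# Toy stand-in for the real metadata; in practice this would be loaded
# from the arXiv / Semantic Scholar dumps. Column names are assumptions.
papers = pd.DataFrame({
    "paper_id": ["p1", "p2", "p3"],
    "year": [2018, 2019, 2020],
})

# Temporal split: pre-2020 papers train user preferences (139,308 CS
# papers in the paper); papers uploaded in 2020 are the candidate items
# to be recommended (14,307 in the paper).
train_papers = papers[papers["year"] < 2020]
item_papers = papers[papers["year"] == 2020]
```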
Dataset Splits (Yes): As training for user preferences, we consider 139,308 CS papers by 178,260 distinct authors before 2020; as the items to be recommended, we consider the 14,307 papers uploaded to arXiv in 2020. We apply two natural language processing-based models, TF-IDF [28] and the sentence transformer model SPECTER [13], to textual features such as the paper's abstract (for both items and the user's historical papers) to generate embeddings for all papers in the training set.
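A minimal sketch of the TF-IDF variant of this pipeline using scikit-learn, with similarity scores aggregated by mean and max cosine similarity as described in the paper. The toy abstracts and variable names are illustrative; SPECTER embeddings could be substituted for the TF-IDF vectors without changing the aggregation step.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Toy stand-ins for the real corpora: abstracts of one user's historical
# papers (training side) and of candidate items (papers from 2020).
user_history_abstracts = ["graph neural networks for recommendation",
                          "fairness constraints in ranking systems"]
item_abstracts = ["contrastive learning for document retrieval",
                  "fair exposure in two-sided marketplaces"]

# Fit TF-IDF on the combined text so both sides share one vocabulary.
vectorizer = TfidfVectorizer(stop_words="english")
embeddings = vectorizer.fit_transform(user_history_abstracts + item_abstracts)
user_emb = embeddings[: len(user_history_abstracts)]
item_emb = embeddings[len(user_history_abstracts):]

# Pairwise cosine similarities: rows = user's papers, columns = items.
sims = cosine_similarity(user_emb, item_emb)

# The two aggregations mentioned in the paper: mean and max over the
# user's historical papers, giving one score per candidate item.
mean_scores = sims.mean(axis=0)
max_scores = sims.max(axis=0)
```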
Hardware Specification (Yes): This entire empirical workflow was run on a machine with 64 CPUs, 1 TB RAM, and 14 TB of (non-SSD) disk. The estimated time was about 20 hours per week for 3 months, and the longest individual run was approximately 12 hours.
Software Dependencies (Yes): We apply two natural language processing-based models, TF-IDF [28] and the sentence transformer model SPECTER [13], to textual features... We use these embeddings to compute similarity scores... For a given value of γ, to compute U_min(γ) we use the cvxpy implementation of the convex optimization algorithm SCS [38].
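The entry above does not spell out the optimization program, so the following is only a plausible sketch of a fairness-constrained recommendation problem solved with cvxpy's SCS interface. The objective and constraint shown (maximize the worst-off user's utility subject to a γ-parameterized item-exposure floor) are an assumption for illustration, not a transcription of the authors' U_min(γ).

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n_users, n_items = 20, 30
V = rng.uniform(size=(n_users, n_items))  # estimated user-item values

# P[u, i]: probability that user u is recommended item i.
P = cp.Variable((n_users, n_items), nonneg=True)

gamma = 0.5  # item-fairness strength in [0, 1]

constraints = [
    cp.sum(P, axis=1) == 1,  # each user gets one recommendation in expectation
    # Assumed item-fairness floor: every item receives at least a gamma
    # fraction of its equal share of total exposure.
    cp.sum(P, axis=0) >= gamma * n_users / n_items,
]

# Assumed objective: the utility of the worst-off user under P.
user_utils = cp.sum(cp.multiply(V, P), axis=1)
prob = cp.Problem(cp.Maximize(cp.min(user_utils)), constraints)
prob.solve(solver=cp.SCS)  # SCS, as cited in the row above

print("U_min(gamma=%.2f) = %.4f" % (gamma, prob.value))
```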
Experiment Setup (No): The paper describes the data sources, the process for generating embeddings (TF-IDF, SPECTER), and how similarity scores are computed (mean/max cosine similarity). It mentions setting the fairness constraint parameter γ to 50 values between 0 and 1. However, it does not provide the hyperparameters typically associated with training complex machine learning models, such as learning rates, batch sizes, optimizers, or number of epochs, for the embedding generation or similarity score computation.
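The one setup detail that is specified, sweeping γ over 50 values in [0, 1], could be reproduced with a simple driver like the following. The `solve_umin` wrapper is hypothetical, repeating the assumed formulation from the cvxpy sketch above rather than code from the repository.

```python
import cvxpy as cp
import numpy as np

def solve_umin(V, gamma):
    """Worst-off user's utility under an assumed gamma item-exposure floor."""
    n_users, n_items = V.shape
    P = cp.Variable((n_users, n_items), nonneg=True)
    constraints = [
        cp.sum(P, axis=1) == 1,
        cp.sum(P, axis=0) >= gamma * n_users / n_items,
    ]
    utils = cp.sum(cp.multiply(V, P), axis=1)
    prob = cp.Problem(cp.Maximize(cp.min(utils)), constraints)
    prob.solve(solver=cp.SCS)
    return prob.value

V = np.random.default_rng(0).uniform(size=(20, 30))  # toy value matrix

# 50 values of gamma between 0 and 1, as in the paper's setup; the
# resulting curve traces the user-item fairness tradeoff.
gammas = np.linspace(0.0, 1.0, 50)
curve = [solve_umin(V, g) for g in gammas]
```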