Two-sided fairness in rankings via Lorenz dominance

Authors: Virginie Do, Sam Corbett-Davies, Jamal Atif, Nicolas Usunier

NeurIPS 2021

Reproducibility assessment — each entry lists the variable, the result, and the LLM response:
Research Type: Experimental — "Our experiments also show that it increases the utility of the worse-off at lower costs in terms of overall utility. We report experimental results on music and friend recommendation tasks, where we analyze the trade-offs obtained by different methods by looking at different points of their Lorenz curves. Our welfare approach generates a wide variety of trade-offs, and is, in particular, more effective at improving the utility of worse-off users than the baselines." (Section 5, Experiments)
Researcher Affiliation: Collaboration — "1Facebook AI; 2LAMSADE, Université PSL, Université Paris Dauphine, CNRS, France. virginiedo@fb.com, scd@fb.com, jamal.atif@dauphine.psl.eu, usunier@fb.com"
Pseudocode: No — The paper describes the Frank-Wolfe algorithm and its steps in text (Section 4) but does not provide a formal pseudocode block or algorithm box.
Open Source Code: No — The paper contains no explicit statement about releasing source code and no link to a code repository for the described methodology.
Open Datasets: Yes — "We present here our experiments with the Lastfm-2k dataset [9, 47], which contains the music listening histories of 1.9k users." "We present in App. F.3 results using the MovieLens-20m dataset [24]." "We generate an artificial task based on the Higgs Twitter dataset [15]."
Dataset Splits: No — The paper states: "We split the data in two parts: 80% for training and 20% for testing" (Appendix F.1), but it mentions only training and testing sets, with no separate validation split.
Hardware Specification: No — The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments.
Software Dependencies: No — The paper mentions using a "matrix factorization algorithm" but does not specify any software libraries, frameworks, or their version numbers used in the implementation.
Experiment Setup: No — The paper describes parts of the experimental protocol, such as dataset selection and splitting ("We select the top 2500 items most listened to, and estimate preferences with a matrix factorization algorithm using a random sample of 80% of the data."), but it does not specify concrete hyperparameters (e.g., learning rate, batch size, optimizer settings) or other low-level configuration details of the models or algorithms used.
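The Lorenz curves referenced under Research Type plot, for each fraction of worst-off users, the share of total utility those users receive. A minimal illustrative sketch (our own function, not the authors' code):

```python
def lorenz_curve(utilities):
    """Return the Lorenz curve of a list of non-negative utilities:
    points (k/n, cumulative share of total utility held by the k worst-off users)."""
    vals = sorted(utilities)  # worst-off users first
    total = sum(vals)
    n = len(vals)
    points, cum = [(0.0, 0.0)], 0.0
    for k, v in enumerate(vals, start=1):
        cum += v
        points.append((k / n, cum / total))
    return points

# A perfectly equal distribution lies on the diagonal;
# an unequal one bows below it.
equal = lorenz_curve([1.0, 1.0, 1.0, 1.0])      # (0.5, 0.5) at the midpoint
unequal = lorenz_curve([0.0, 0.0, 0.0, 4.0])    # 3 worst-off users hold 0% of utility
```

One distribution Lorenz-dominates another when its curve lies everywhere on or above the other's, which is the fairness criterion in the paper's title.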
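The Frank-Wolfe algorithm that the paper describes only in text (see Pseudocode above) follows a standard pattern: repeatedly move toward the vertex returned by a linear maximization oracle. The following is a textbook sketch over the probability simplex, not the paper's ranking-specific variant:

```python
def frank_wolfe_simplex(grad, x0, n_iters=100):
    """Generic Frank-Wolfe for maximizing a concave f over the probability simplex.
    grad(x) returns the gradient of f at x; the linear oracle over the simplex
    is simply the vertex (basis vector) with the largest gradient coordinate."""
    x = list(x0)
    for t in range(n_iters):
        g = grad(x)
        i_star = max(range(len(x)), key=lambda i: g[i])  # linear maximization oracle
        gamma = 2.0 / (t + 2.0)                          # standard step-size schedule
        x = [(1 - gamma) * xi for xi in x]
        x[i_star] += gamma                               # move toward the chosen vertex
    return x

# Example: maximize f(x) = -sum((x_i - c_i)^2) for a target c on the simplex.
c = [0.2, 0.3, 0.5]
grad = lambda x: [-2 * (xi - ci) for xi, ci in zip(x, c)]
x = frank_wolfe_simplex(grad, [1.0, 0.0, 0.0], n_iters=500)
# x converges toward c at the usual O(1/t) Frank-Wolfe rate
```

In the paper's setting the feasible set is the set of ranking policies rather than the simplex, and the linear oracle reduces to sorting items by score; the iteration structure is the same.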
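The 80/20 train/test split quoted under Dataset Splits can be sketched as follows; the function name and fixed seed are our own illustrative choices, not details from the paper:

```python
import random

def split_interactions(interactions, train_frac=0.8, seed=0):
    """Shuffle and split a list of (user, item) interactions:
    train_frac of the records for training, the rest for testing."""
    rng = random.Random(seed)        # fixed seed for a reproducible split
    data = list(interactions)
    rng.shuffle(data)
    cut = int(train_frac * len(data))
    return data[:cut], data[cut:]

interactions = [(u, i) for u in range(10) for i in range(5)]  # 50 toy records
train, test = split_interactions(interactions)
# len(train) == 40, len(test) == 10
```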
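The matrix factorization step mentioned under Experiment Setup is left unspecified in the paper; a minimal SGD-based sketch with our own illustrative hyperparameters (rank, learning rate, regularization are not the authors' settings):

```python
import random

def factorize(ratings, n_users, n_items, rank=8, lr=0.01, reg=0.1, epochs=200, seed=0):
    """Fit user/item factors to observed (user, item, rating) triples by SGD
    on squared error with L2 regularization; returns the two factor matrices."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(rank)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            pred = sum(a * b for a, b in zip(U[u], V[i]))
            err = r - pred
            for k in range(rank):
                uk, vk = U[u][k], V[i][k]
                U[u][k] += lr * (err * vk - reg * uk)
                V[i][k] += lr * (err * uk - reg * vk)
    return U, V

# Toy example: two users with opposite tastes over two items.
ratings = [(0, 0, 5.0), (0, 1, 1.0), (1, 0, 1.0), (1, 1, 5.0)]
U, V = factorize(ratings, n_users=2, n_items=2)
pred_00 = sum(a * b for a, b in zip(U[0], V[0]))
pred_01 = sum(a * b for a, b in zip(U[0], V[1]))
# the learned factors rank item 0 above item 1 for user 0 (pred_00 > pred_01)
```

The predicted scores from such a model are what the paper's ranking stage then optimizes over; the paper does not report which factorization implementation or settings were used.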