reproducibilityindex.ai

Cookie Consent Has Disparate Impact on Estimation Accuracy

Authors: Erik Miehling, Rahul Nair, Elizabeth Daly, Karthikeyan Natesan Ramamurthy, Robert Redmond

NeurIPS 2023

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We empirically demonstrate that when consent rates exhibit demographic-dependence, user consent has a disparate impact on the recommender agent's ability to estimate users' latent attributes. Empirical results were obtained via a simulator based on RecSim (21).
Researcher Affiliation	Industry	IBM Research erik.miehling@ibm.com {rahul.nair,elizabeth.daly}@ie.ibm.com {knatesa,rredmond}@us.ibm.com
Pseudocode	Yes	Expressions for the Bayesian updates and expected binomial probabilities can be found in Appendix A, with pseudocode of the recommendation process in Appendix B. Algorithm 1: Recommendation procedure.
Open Source Code	Yes	Our simulator was built upon RecSim (21) (source code at https://github.com/emiehling/cookie-consent/).
Open Datasets	No	The paper uses a simulator to generate synthetic data, as described in the "Advertisement and user samplers" section: "The advertisement sampler object (Advertisement Sampler) defines the distribution of each ad feature... Similarly, the user sampler object (User State Sampler) defines the distribution of each user feature...". It does not use or provide access to a publicly available dataset.
Dataset Splits	No	The paper's experiments are based on simulations generating synthetic data, as detailed in the "User model" and "Advertisement and user samplers" sections. Therefore, it does not specify traditional training, validation, and test dataset splits in the context of a fixed dataset.
Hardware Specification	Yes	Simulations were run in Python 3.8 on an Intel(R) Xeon(R) CPU E5-2667 v2 (3.30GHz).
Software Dependencies	Yes	Our simulator was built upon RecSim (21)... Simulations were run in Python 3.8...
Experiment Setup	Yes	Base model parameters assumed throughout this section are: number of users n = 1000, number of ads m = 200, ad pool size l = 50, and number of cohorts d = 2. Estimation is carried out via stochastic gradient descent with a learning rate of 0.01, regularization weight of 0.01, and a stopping threshold on the mean-squared error of ε_thresh = 0.001. Latent factors are assumed to be of dimension k = 50.
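
The Experiment Setup row names the estimation hyperparameters (learning rate 0.01, regularization weight 0.01, MSE stopping threshold 0.001, latent dimension k = 50) but not the objective itself. The sketch below shows how those settings could fit together, assuming a standard L2-regularized matrix-factorization loss over observed user-ad responses. The function name `factorize_sgd`, the `mask` argument, the `max_epochs` cap, and the initialization scale are all hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def factorize_sgd(R, mask, k=50, lr=0.01, reg=0.01, eps_thresh=0.001,
                  max_epochs=500, seed=0):
    """Fit R ~ U @ V.T on observed entries with plain SGD (illustrative only).

    R    : (n, m) array of user-ad responses.
    mask : (n, m) boolean array, True where a response was observed.
    The defaults mirror the hyperparameters quoted in the table; the loss
    and the update rule are assumptions, not the authors' implementation.
    """
    rng = np.random.default_rng(seed)
    n, m = R.shape
    U = 0.1 * rng.standard_normal((n, k))  # user latent factors
    V = 0.1 * rng.standard_normal((m, k))  # ad latent factors
    rows, cols = np.nonzero(mask)
    mse = np.inf

    for _ in range(max_epochs):  # max_epochs is a hypothetical safety cap
        # Visit the observed entries in a fresh random order each epoch.
        for idx in rng.permutation(len(rows)):
            i, j = rows[idx], cols[idx]
            err = R[i, j] - U[i] @ V[j]
            u_old = U[i].copy()
            # Gradient step on squared error plus L2 penalty.
            U[i] += lr * (err * V[j] - reg * U[i])
            V[j] += lr * (err * u_old - reg * V[j])

        # Stop once the MSE on observed entries falls below the threshold.
        pred = U @ V.T
        mse = float(np.mean((R[rows, cols] - pred[rows, cols]) ** 2))
        if mse < eps_thresh:
            break

    return U, V, mse
```

With the base parameters from the table, `R` would be a 1000 x 200 array; under partial consent, `mask` would mark only the responses the agent is permitted to observe, which is where a consent-dependent gap in estimation accuracy could arise.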