Modeling content creator incentives on algorithm-curated platforms
Authors: Jiri Hron, Karl Krauth, Michael Jordan, Niki Kilbertus, Sarah Dean
ICLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To this end, we propose tools for numerically finding equilibria in exposure games, and illustrate results of an audit on the MovieLens and LastFM datasets. Among else, we find that the strategically produced content exhibits strong dependence between algorithmic exploration and content diversity, and between model expressivity and bias towards gender-based user and creator groups. |
| Researcher Affiliation | Academia | Jiri Hron, Karl Krauth, Michael I. Jordan, Niki Kilbertus, Sarah Dean (University of Cambridge, UC Berkeley, TU Munich & Helmholtz Munich, Cornell University) |
| Pseudocode | No | The paper describes the gradient ascent algorithm in text but does not provide a formal pseudocode block or algorithm figure. |
| Open Source Code | Yes | implement our code in Python (van Rossum & Drake, 2009) and rely on numpy (Harris et al., 2020), scikit-surprise (Hug, 2020), pandas (pandas development team, 2020), matplotlib (Hunter, 2007), jupyter (Kluyver et al., 2016), reclab (Krauth et al., 2020), and JAX (Bradbury et al., 2018) packages. ... (see the config.py file in the provided code), which multiplies the stepsize by τ before its use, we did not use this option in the experiments. ... See optimisation.py, particularly the optax_minimisation method, for more details. ... The second-order Riemannian test (Definition 3) is implemented in manifold.py. |
| Open Datasets | Yes | We use the MovieLens-100K and LastFM-360K datasets (Harper & Konstan, 2015; Bertin-Mahieux et al., 2011; Shakespeare et al., 2020) |
| Dataset Splits | Yes | To select regularization and learning rate, we performed a two-fold 90/10 split cross-validation separately on each dataset. |
| Hardware Specification | Yes | The final MovieLens and LastFM experiments were run on 72 AWS machines, each with 4 CPU cores, for 5 hours. |
| Software Dependencies | Yes | implement our code in Python (van Rossum & Drake, 2009) and rely on numpy (Harris et al., 2020), scikit-surprise (Hug, 2020), pandas (pandas development team, 2020), matplotlib (Hunter, 2007), jupyter (Kluyver et al., 2016), reclab (Krauth et al., 2020), and JAX (Bradbury et al., 2018) packages |
| Experiment Setup | Yes | We investigate the sensitivity of the incentivized content to the: (i) rating model ∈ {PMF, NMF}, (ii) embedding dimension d ∈ {3, 50}, and (iii) temperature log10 τ ∈ {−2, −1, 0}. We further vary the number of producers n ∈ {10, 100} to examine scenarios with different producer to consumer ratios... We employ simple gradient ascent (Singh et al., 2000; Balduzzi et al., 2018, see Appendix C.2 for comparison with gradient descent) combined with reparametrization s_i = θ_i / ‖θ_i‖ for each producer, where we iteratively update θ_{i,t} = θ_{i,t−1} + α ∇_{θ_{i,t−1}} u_i(s_{i,t−1}, s_{−i,t−1}) for shared step size α > 0... We applied early stopping when the ℓ2-change in parameters between iterations dipped below 10^{−8}√d; the number of iterations was set to 50K so convergence was achieved for every run. ... stepsize sweep was restricted to {10^{−2}, 10^{−1}}; the number of steps was upper bounded by 50,000 (all runs have successfully converged to a fixed point as mentioned). |
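The gradient-ascent procedure quoted above can be sketched in JAX (one of the paper's stated dependencies). This is a minimal illustration, not the authors' implementation: the `utility` below is a hypothetical stand-in (a producer's softmax exposure share over consumer-producer affinities), whereas the paper's actual utilities come from its exposure-game model; the reparametrization `s_i = θ_i/‖θ_i‖`, the additive update with shared step size `α`, and the early stop on small ℓ2 parameter change do follow the quoted description.

```python
import jax
import jax.numpy as jnp

def utility(theta_i, others, consumers, tau=1.0):
    """Toy exposure utility: producer i's total softmax share of attention.

    Hypothetical stand-in for the paper's exposure-game utilities.
    """
    s_i = theta_i / jnp.linalg.norm(theta_i)      # reparametrize onto the sphere
    s_all = jnp.vstack([s_i[None, :], others])    # (n, d) producer embeddings
    scores = consumers @ s_all.T / tau            # (m, n) consumer-producer affinities
    probs = jax.nn.softmax(scores, axis=1)        # exposure probabilities per consumer
    return probs[:, 0].sum()                      # producer i's total exposure

def ascend(theta0, others, consumers, alpha=0.1, tau=1.0,
           max_steps=50_000, tol=None):
    """Gradient ascent on theta_i with early stopping on small l2 change."""
    d = theta0.shape[0]
    tol = 1e-8 * jnp.sqrt(d) if tol is None else tol
    grad_u = jax.grad(utility)                    # gradient w.r.t. theta_i
    theta = theta0
    for _ in range(max_steps):
        new = theta + alpha * grad_u(theta, others, consumers, tau)
        if jnp.linalg.norm(new - theta) < tol:    # early stopping criterion
            theta = new
            break
        theta = new
    return theta / jnp.linalg.norm(theta)         # return the unit-norm strategy
```

In a multi-producer audit, each producer would run this update simultaneously against the others' current strategies; the sketch shows only a single producer's best-response ascent.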