Learning to Incentivize: Eliciting Effort via Output Agreement

Authors: Yang Liu, Yiling Chen

IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | We summarize our main contributions as follows: Since the quality of answers is endogenously determined, a requester's utility depends on the behavior of participants. Optimizing the payment level requires an understanding of the participants' behavior. We characterize Bayesian Nash equilibria (BNE) for two output agreement mechanisms with any given level of payment and show that at equilibrium there is a unique threshold strategy for positive effort exertion. For the static setting where the requester knows the cost distribution, when the cost distribution satisfies certain conditions, we show that the optimal payment level in the two output agreement mechanisms is a solution to a convex program and hence can be efficiently solved. For the dynamic setting where the requester doesn't know the cost distribution, we design a sequential mechanism that combines eliciting and learning the cost distribution with incentivizing effort exertion in a variant of the output agreement mechanism. Our mechanism ensures that participants truthfully report their cost of effort exertion when asked, in addition to following the same strategy on effort exertion and answer reporting as in the static setting for each task. We further prove a performance guarantee for this mechanism in terms of the requester's regret on expected utility. All omitted proofs can be found in [Liu and Chen, 2016]. (A numerical sketch of the static-setting payment optimization appears after this table.)
Researcher Affiliation | Academia | Yang Liu and Yiling Chen, Harvard University, Cambridge, MA, USA, {yangl,yiling}@seas.harvard.edu
Pseudocode | Yes | Mechanism 1 (M_Crowd). For each step t: 1. Assign the task; workers then report costs. Denote the reports as (ĉ_1(t), ..., ĉ_N(t)). This is a voluntary procedure: a worker i can choose not to report his cost, in which case the requester sets ĉ_i(t) := c_max. 2. The data requester randomly selects a threshold c*(t) uniformly from the support [0, c_max], such that only the workers who reported ĉ_i(t) ≤ c*(t) will be considered for a bonus following PA; others will be given a bonus according to a probability that is independent of workers' reports (see Remarks for details). 3. The requester estimates a bonus level B_i(t) for each worker i that corresponds to the threshold level c*(t) under PA, using only the data collected from workers j ≠ i and from all previous stages. This is done by estimating F(·) first and then plugging it into Eqn.(2). The requester then adds a positive, time-dependent perturbation δ(t) to B_i(t): B_i(t) := B_i(t) + δ(t). 4. The data requester then announces the bonus bundle [B_1(t), ..., B_N(t)]. (A code sketch of one step of this mechanism appears after this table.)
Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology.
Open Datasets | No | The paper is theoretical and does not conduct experiments with a dataset, so no information about publicly available training data is provided.
Dataset Splits | No | The paper is theoretical and does not conduct experiments, so it does not provide training/test/validation dataset splits.
Hardware Specification | No | The paper is theoretical and does not specify any hardware used for experiments.
Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not detail an experimental setup, hyperparameters, or training configuration.
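The static setting summarized in the Research Type row reduces payment selection to a convex program once the cost distribution F is known. Below is a minimal numerical sketch of that idea, assuming a Uniform[0, c_max] cost distribution and a deliberately toy utility and agreement model; effort_threshold, requester_utility, and all constants are illustrative stand-ins, not the paper's equilibrium characterization or its Eqn.(2), and a bounded scalar search stands in for an actual convex solver.

```python
# Hypothetical sketch: choosing the payment level B that maximizes a toy
# requester utility under a threshold model of effort exertion.
import numpy as np
from scipy.optimize import minimize_scalar

c_max = 1.0              # assumed upper bound of the cost support
N = 10                   # number of workers
value_of_accuracy = 5.0  # assumed value the requester places on accuracy

def F(c):
    # Assumed cost CDF: Uniform[0, c_max].
    return np.clip(c / c_max, 0.0, 1.0)

def effort_threshold(B):
    # Stylized map from bonus level to the equilibrium cost threshold:
    # a worker exerts effort iff her cost is below c*(B). Here we simply
    # take c*(B) proportional to B, capped at c_max (illustrative only).
    return min(B, c_max)

def requester_utility(B):
    p = F(effort_threshold(B))                 # fraction exerting effort
    accuracy = p                               # toy accuracy model
    match_prob = p**2 + 0.5 * (1 - p)**2       # toy agreement probability
    expected_payment = N * B * match_prob      # pay B on agreement
    return value_of_accuracy * accuracy - expected_payment

# Maximize utility = minimize its negation over the payment range.
res = minimize_scalar(lambda B: -requester_utility(B),
                      bounds=(0.0, c_max), method="bounded")
print(f"approximately optimal payment B* = {res.x:.3f}")
```

Under the uniform-cost assumption the interior optimum exists and is found numerically; the paper's conditions on F are what guarantee the program is actually convex.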
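The M_Crowd pseudocode in the Pseudocode row combines cost elicitation, learning F(·), and a perturbed bonus estimate at each step. The following is a minimal sketch of one such step under the same voluntary-reporting convention; estimate_bonus_from_history is a hypothetical placeholder for fitting F(·) and plugging it into the paper's Eqn.(2), and the estimator inside it is not the paper's formula.

```python
# Hypothetical sketch of one step of Mechanism 1 (M_Crowd).
import random

C_MAX = 1.0  # assumed upper bound of the cost support

def estimate_bonus_from_history(history, exclude_worker, threshold):
    # Placeholder: estimate the bonus B_i(t) matching threshold c*(t)
    # from all reports except worker i's (empirical-CDF stand-in).
    others = [c for (w, c) in history if w != exclude_worker]
    f_hat = sum(c <= threshold for c in others) / max(len(others), 1)
    return threshold * f_hat  # illustrative, not the paper's Eqn.(2)

def m_crowd_step(reports, history, t, delta):
    # 1. Reporting is voluntary: missing reports default to c_max.
    n = len(reports)
    c_hat = [c if c is not None else C_MAX for c in reports]
    history.extend((i, c_hat[i]) for i in range(n))
    # 2. Draw the threshold c*(t) uniformly from [0, c_max]; workers
    #    reporting at or below it are eligible for the PA-based bonus.
    c_star = random.uniform(0.0, C_MAX)
    eligible = [i for i in range(n) if c_hat[i] <= c_star]
    # 3. Estimate each worker's bonus from the other workers' data,
    #    then add the time-dependent perturbation delta(t).
    bonuses = [estimate_bonus_from_history(history, i, c_star) + delta(t)
               for i in range(n)]
    # 4. Announce the bonus bundle [B_1(t), ..., B_N(t)].
    return bonuses, eligible

bonuses, eligible = m_crowd_step(
    reports=[0.2, None, 0.7], history=[], t=1,
    delta=lambda t: 1.0 / (t + 1))
print(bonuses, eligible)
```

The decaying delta(t) here is only one plausible choice; the paper requires a positive, time-dependent perturbation, whose schedule drives the regret guarantee.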