Learning to Incentivize: Eliciting Effort via Output Agreement
Authors: Yang Liu, Yiling Chen
IJCAI 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We summarize our main contributions as follows: Since the quality of answers is endogenously determined, a requester's utility depends on the behavior of participants. Optimizing the payment level requires an understanding of the participant's behavior. We characterize Bayesian Nash equilibria (BNE) for two output agreement mechanisms with any given level of payment and show that at equilibrium there is a unique threshold strategy for positive effort exertion. For the static setting where the requester knows the cost distribution, when the cost distribution satisfies certain conditions, we show that the optimal payment level in the two output agreement mechanisms is a solution to a convex program and hence can be efficiently solved. For the dynamic setting where the requester doesn't know the cost distribution, we design a sequential mechanism that combines eliciting and learning the cost distribution with incentivizing effort exertion in a variant of output agreement mechanism. Our mechanism ensures that participants truthfully report their cost of effort exertion when asked, in addition to following the same strategy on effort exertion and answer reporting as that in the static setting for each task. We further prove a performance guarantee of this mechanism in terms of the requester's regret on expected utility. All omitted proofs can be found in [Liu and Chen, 2016]. (A hedged numerical sketch of this threshold-and-payment computation appears after the table.) |
| Researcher Affiliation | Academia | Yang Liu and Yiling Chen, Harvard University, Cambridge MA, USA, {yangl,yiling}@seas.harvard.edu |
| Pseudocode | Yes | Mechanism 1 (M-Crowd). For each step t: 1. Assign the task, and workers then report costs. Denote the reports as (c_1(t), ..., c_N(t)). This is a voluntary procedure. A worker i can choose not to report his cost, in which case the requester sets c_i(t) := c_max. 2. The data requester randomly selects a threshold c*(t) uniformly from the support [0, c_max], such that only the workers who reported c_i(t) ≤ c*(t) will be considered for bonus following PA; others will be given a bonus according to a probability that is independent of workers' reports (see Remarks for details). 3. The requester estimates a bonus level B_i(t) for each worker i that corresponds to the threshold level c*(t) under PA, using only the data collected from workers j ≠ i, and from all previous stages. This is done via estimating F(·) first and then plugging it into Eqn. (2). Then the requester adds a positive, time-dependent perturbation δ(t) to B_i(t): B_i(t) := B_i(t) + δ(t). 4. The data requester will then announce the bonus bundle [B_1(t), ..., B_N(t)]. (A hedged simulation sketch of this loop appears after the table.) |
| Open Source Code | No | The paper does not provide any statement or link regarding the availability of open-source code for the described methodology. |
| Open Datasets | No | The paper is theoretical and does not conduct experiments with a dataset; therefore, no information about publicly available training data is provided. |
| Dataset Splits | No | The paper is theoretical and does not conduct experiments, so it does not provide training/test/validation dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not provide any specific hardware specifications used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is theoretical and does not detail an experimental setup including hyperparameters or training configurations. |
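
The equilibrium characterization quoted in the Research Type row lends itself to a small numerical illustration. The sketch below is not the paper's model: the accuracy parameters `P0`/`P1`, the uniform cost CDF `F`, the binary-answer matching probability, and the requester's value `V` are all assumed stand-ins. It only shows the two-step structure the paper describes: solve for the unique threshold c* below which workers exert effort, then optimize the payment level B (a one-dimensional problem that the paper shows is a convex program under conditions on F).

```python
import numpy as np

# --- Hypothetical model primitives (stand-ins, NOT taken from the paper) ---
P0 = 0.6      # answer accuracy without effort (assumed weak prior signal)
P1 = 0.9      # answer accuracy with effort
V = 10.0      # requester's value per high-effort answer
C_MAX = 1.0   # upper bound of the effort-cost support

def F(c):
    """Assumed cost CDF: uniform on [0, C_MAX]."""
    return float(np.clip(c / C_MAX, 0.0, 1.0))

def match_prob(p_self, q):
    """Probability of agreeing with a random peer on a binary task
    when the peer exerts effort with probability q."""
    p_peer = q * P1 + (1 - q) * P0
    return p_self * p_peer + (1 - p_self) * (1 - p_peer)

def threshold(B, iters=200):
    """Fixed-point iteration for the equilibrium effort threshold c*:
    a worker with cost c exerts effort iff c is at most the expected
    bonus gain from effort, so c* solves
        c* = B * (match_prob(P1, F(c*)) - match_prob(P0, F(c*)))."""
    c_star = 0.0
    for _ in range(iters):
        q = F(c_star)
        gain = B * (match_prob(P1, q) - match_prob(P0, q))
        c_star = min(max(gain, 0.0), C_MAX)
    return c_star

def requester_utility(B):
    """Expected utility: value of induced effort minus expected bonus paid."""
    c_star = threshold(B)
    q = F(c_star)  # equilibrium fraction of workers exerting effort
    pay_prob = q * match_prob(P1, q) + (1 - q) * match_prob(P0, q)
    return V * q - B * pay_prob

# One-dimensional search over the payment level. (The paper shows the static
# problem is a convex program under conditions on F; a grid search stands in
# for a proper convex solver here.)
grid = np.linspace(0.0, 5.0, 501)
best_B = max(grid, key=requester_utility)
print(f"B* ≈ {best_B:.2f}, c* ≈ {threshold(best_B):.3f}, "
      f"utility ≈ {requester_utility(best_B):.3f}")
```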
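The Pseudocode row's Mechanism 1 (M-Crowd) can likewise be sketched as a simulation loop. The bonus rule `bonus_for_threshold` below is a hypothetical placeholder for the paper's Eqn. (2), and the perturbation schedule δ(t) is assumed; only the step structure follows the quoted mechanism (default missing reports to c_max, draw c*(t) uniformly, estimate B_i(t) from workers j ≠ i plus past stages, add δ(t), announce the bundle).

```python
import random

C_MAX = 1.0  # upper bound of the cost support

def empirical_cdf(samples):
    """Empirical estimate of the cost CDF F from reported costs."""
    def F_hat(c):
        return sum(s <= c for s in samples) / len(samples)
    return F_hat

def bonus_for_threshold(F_hat, c_star):
    """Hypothetical placeholder for the paper's Eqn. (2), which maps a
    target threshold c* (via the estimated F) to the bonus level that
    induces c* as the equilibrium threshold. This linear stand-in only
    illustrates the plumbing, not the paper's actual formula."""
    return c_star / max(F_hat(c_star), 1e-6)

def m_crowd_step(t, reports, history):
    """One step in the spirit of Mechanism 1 (M-Crowd). Requires N >= 2
    workers so each bonus estimate has at least one other report. The
    paper's rule for paying ineligible workers via a report-independent
    probability is omitted here."""
    # Step 1: reporting is voluntary; a missing report defaults to c_max.
    reports = [c if c is not None else C_MAX for c in reports]
    N = len(reports)
    # Step 2: draw c*(t) uniformly from [0, c_max]; only workers who
    # reported c_i(t) <= c*(t) are paid via the agreement rule PA.
    c_star = random.uniform(0.0, C_MAX)
    eligible = [i for i, c in enumerate(reports) if c <= c_star]
    # Step 3: estimate each worker i's bonus using only the reports of
    # workers j != i plus all previous stages, then add a positive,
    # time-dependent perturbation delta(t) (decay schedule assumed).
    delta_t = 1.0 / (t + 1) ** 0.5
    bonuses = []
    for i in range(N):
        others = [c for j, c in enumerate(reports) if j != i] + history
        bonuses.append(bonus_for_threshold(empirical_cdf(others), c_star) + delta_t)
    history.extend(reports)
    # Step 4: announce the bonus bundle [B_1(t), ..., B_N(t)].
    return c_star, eligible, bonuses

# Minimal usage with three workers; None marks a worker who declines to report.
history = []
for t in range(3):
    reports = [random.uniform(0, C_MAX), random.uniform(0, C_MAX), None]
    c_star, eligible, bonuses = m_crowd_step(t, reports, history)
    print(f"t={t}: c*={c_star:.2f}, eligible={eligible}, "
          f"bonuses={[round(b, 2) for b in bonuses]}")
```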