Inference Aided Reinforcement Learning for Incentive Mechanism Design in Crowdsourcing
Authors: Zehong Hu, Yitao Liang, Jie Zhang, Zhao Li, Yang Liu
NeurIPS 2018
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Empirical results show that our mechanism performs consistently well under both rational and non-fully rational (adaptive learning) worker models. Additionally, the paper includes a section titled "5 Empirical Experiments." |
| Researcher Affiliation | Collaboration | Zehong Hu (Alibaba Group, Hangzhou, China); Yitao Liang (University of California, Los Angeles); Jie Zhang (Nanyang Technological University); Zhao Li (Alibaba Group, Hangzhou, China); Yang Liu (University of California, Santa Cruz / Harvard University) |
| Pseudocode | Yes | Algorithm 1 Gibbs sampling for crowdsourcing (a hedged sketch of this class of sampler appears after the table). |
| Open Source Code | No | The paper does not provide any explicit statement about releasing source code or a link to a code repository for the methodology described. |
| Open Datasets | Yes | We utilize the RTE dataset, where workers need to check whether a hypothesis sentence can be inferred from the provided sentence [20]. |
| Dataset Splits | No | The paper mentions using the RTE dataset and setting environmental parameters for experiments, but it does not specify explicit training, validation, or test dataset splits or their sizes. |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU, GPU models, or memory specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific software dependencies with version numbers (e.g., programming languages, libraries, or frameworks with their versions). |
| Experiment Setup | Yes | For all the experiments in this subsection, we set the environment parameters as follows: N = 10, P_H = 0.9, b = 0, c_H = 0.02; the set of the scaling factors is A = {0.1, 1.0, 5.0, 10}; F(A) = A^10 and η = 0.001 as in the utility function (Eqn. (3)); the number of time steps for an episode is set to be 28. Meanwhile, for the adjustable parameters in our mechanism, we set the number of tasks at each step M = 100 and the exploration rate for RIL ϵ = 0.2. (These values are collected into an illustrative configuration sketch after the table.) |
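
The Pseudocode row above refers to the paper's Algorithm 1, a Gibbs sampler for inferring true labels from crowdsourced answers. The paper's exact update equations are not reproduced here; the sketch below is a minimal one-coin Dawid-Skene-style Gibbs sampler for binary labels, intended only as a representative of that class of algorithm. The function name `gibbs_crowdsourcing`, the Beta(2, 2) prior on worker accuracy, and the iteration counts are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch of Gibbs sampling for binary crowdsourced labels
# (one-coin Dawid-Skene-style model); hypothetical, not the paper's Algorithm 1.
import numpy as np

def gibbs_crowdsourcing(labels, n_iters=500, burn_in=100, rng=None):
    """labels: (M tasks, N workers) array of 0/1 worker answers."""
    rng = np.random.default_rng() if rng is None else rng
    M, N = labels.shape
    truth = rng.integers(0, 2, size=M)   # initial guess of the true labels
    accuracy = np.full(N, 0.7)           # initial worker accuracies
    truth_counts = np.zeros(M)

    for it in range(n_iters):
        # Resample each task's true label given the current worker accuracies.
        for i in range(M):
            log_p = np.zeros(2)
            for z in (0, 1):
                agree = labels[i] == z
                log_p[z] = np.sum(np.log(np.where(agree, accuracy, 1.0 - accuracy)))
            p_one = 1.0 / (1.0 + np.exp(log_p[0] - log_p[1]))
            truth[i] = rng.random() < p_one
        # Resample each worker's accuracy given the current true labels,
        # using a Beta(2, 2) prior (the full conditional is also Beta).
        correct = (labels == truth[:, None]).sum(axis=0)
        accuracy = rng.beta(2 + correct, 2 + (M - correct))
        if it >= burn_in:
            truth_counts += truth
    return truth_counts / (n_iters - burn_in)  # posterior P(label = 1) per task
```

In the paper's mechanism, posterior estimates of this kind are used both to score workers for payment and to summarize the state for the reinforcement-learning (RIL) component, so a sampler like this would be rerun on each step's answer matrix.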
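
The Experiment Setup row reports the environment and mechanism parameters used in the experiments. For readers reimplementing the setup, one possible way to collect them is sketched below; the dictionary key names are hypothetical, and only the numeric values come from the paper.

```python
# Illustrative configuration; key names are assumptions, values are as reported.
EXPERIMENT_CONFIG = {
    "num_workers": 10,            # N
    "p_high_effort": 0.9,         # P_H (accuracy when exerting high effort)
    "b": 0.0,                     # b
    "cost_high_effort": 0.02,     # c_H
    "scaling_factors": [0.1, 1.0, 5.0, 10.0],  # the set of scaling factors
    "utility_exponent": 10,       # F(A) = A^10 in the utility function (Eqn. (3))
    "eta": 0.001,                 # η in the utility function (Eqn. (3))
    "steps_per_episode": 28,
    "tasks_per_step": 100,        # M
    "exploration_rate": 0.2,      # ϵ for RIL
}
```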