CrowdMR: Integrating Crowdsourcing with MapReduce for AI-Hard Problems
Authors: Jun Chen, Chaokun Wang, Yiyuan Bai
AAAI 2015
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To showcase the usability of CrowdMR, we introduce an example of gender classification using CrowdMR. For the instances whose confidence values are lower than a given threshold α, CrowdMR distributes HITs in the form of CAPTCHA which automatically discriminates machine and human (Ahn et al. 2008). With gradual completion of HITs, the quality of results is getting better. Due to the flexibility of CrowdMR, we can easily extend it to other AI-hard problems, e.g., word-sense disambiguation and face recognition. We summarize our major contributions as follows: (1) We integrated crowdsourcing with MapReduce in our CrowdMR model, and the confidence was employed to dissect machine and human computation. (2) An incremental scheduling algorithm based on users' votes was proposed to deal with the non-real-time property of crowdsourcing. (3) An online gender classification application based on face detection was developed to showcase the usability of CrowdMR. |
| Researcher Affiliation | Academia | Jun Chen, Chaokun Wang, and Yiyuan Bai; School of Software, Tsinghua University, Beijing 100084, P.R. China; chenjun14@mails.thu.edu.cn, chaokun@tsinghua.edu.cn, eldereal@gmail.com |
| Pseudocode | No | The paper describes the system architecture and phases but does not include pseudocode or an algorithm block. |
| Open Source Code | No | The paper does not provide any concrete access to source code for the methodology described. |
| Open Datasets | No | The paper mentions "gender classification" and "face detection" but does not specify the dataset used or provide any concrete access information (link, DOI, repository, or formal citation) for a publicly available or open dataset. |
| Dataset Splits | No | The paper does not provide specific dataset split information (exact percentages, sample counts, or detailed splitting methodology). |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details (e.g., library or solver names with version numbers). |
| Experiment Setup | No | The paper describes the system architecture and application but does not provide specific experimental setup details (concrete hyperparameter values, training configurations, or system-level settings). |
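
The Research Type row quotes the paper's core idea: instances whose confidence falls below a threshold α are sent to the crowd as HITs, and results improve as votes arrive. Since the paper provides neither pseudocode nor source code, the following Python is only a minimal sketch of that idea under stated assumptions. The function names `split_by_confidence` and `merge_votes`, the tuple layout, and the use of a simple majority vote are all hypothetical choices for illustration, not the authors' algorithm or scheduling method.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical record layout: (instance_id, machine_label, confidence in [0, 1]).
Prediction = Tuple[str, str, float]


def split_by_confidence(
    predictions: List[Prediction], alpha: float
) -> Tuple[Dict[str, str], List[str]]:
    """Separate machine-accepted labels from instances that need crowd input.

    Instances with confidence below alpha are queued for HITs; the rest keep
    their machine label.
    """
    accepted: Dict[str, str] = {}
    needs_crowd: List[str] = []
    for instance_id, label, confidence in predictions:
        if confidence < alpha:
            needs_crowd.append(instance_id)
        else:
            accepted[instance_id] = label
    return accepted, needs_crowd


def merge_votes(
    machine_labels: Dict[str, str], votes: Dict[str, List[str]]
) -> Dict[str, str]:
    """Override machine labels with the majority crowd vote where votes exist.

    Assumption: a plain majority vote stands in for the paper's vote-based
    incremental scheduling, which is not specified in enough detail to reproduce.
    Instances without votes yet keep their machine label, so this can be
    re-run as more HITs complete.
    """
    merged = dict(machine_labels)
    for instance_id, labels in votes.items():
        if labels:
            merged[instance_id] = Counter(labels).most_common(1)[0][0]
    return merged
```

Calling `split_by_confidence` once after the machine pass and `merge_votes` each time new votes arrive mirrors the "gradual completion of HITs" described in the quoted text, though the actual system may schedule and aggregate differently.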