Crowdsourcing with Multiple-Source Knowledge Transfer

Authors: Guangyang Han, Jinzheng Tu, Guoxian Yu, Jun Wang, Carlotta Domeniconi

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on real-world image and text datasets prove the effectiveness of Crowd MKT in improving the quality and reducing the budget. |
| Researcher Affiliation | Academia | (1) College of Computer and Information Sciences, Southwest University, Chongqing, China; (2) Department of Computer Science, George Mason University, VA, USA. {gyhan, tujinzheng, gxyu, kingjun}@swu.edu.cn, carlotta@cs.gmu.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks (e.g., a figure or section explicitly labeled 'Pseudocode' or 'Algorithm') were found in the paper. |
| Open Source Code | No | The paper does not provide any link or statement regarding the public availability of the source code for its methodology. |
| Open Datasets | Yes | We study the effectiveness of our Crowd MKT through a series of experiments on two real-world datasets (20-newsgroups and CUB-200-2011 [Wah et al., 2011]) with multiple source and target domains. (Loading sketch below.) |
| Dataset Splits | No | The paper mentions that the hyperparameter τ is 'tuned using a validation set', but it does not give the overall train/validation/test splits (e.g., exact percentages, sample counts, or citations to predefined splits) needed for reproduction. |
| Hardware Specification | Yes | Runtime is recorded on a PC with Windows 10 OS, an AMD Ryzen 7 2700X, and 16 GB RAM. |
| Software Dependencies | No | The paper names the software components and algorithms it uses (e.g., VGG-19, TF-IDF, the EM algorithm, L-BFGS, sparse coding) but gives no version numbers for any software dependency (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | We fix the feature dimension of all domains to d = 1000, and set the dictionary size to k = 20. For Crowd MKT and its variants, we simply set γ = 0.1 and τ = 1. We simulate four types of workers (spammer, random, normal, and expert), with different capacity ranges and proportions as shown in Table 3. We generate 50 workers and ask each worker to give 24 annotations to the image tasks, each of which has to have at least one annotation. As a result, each task on average has five annotations. (Simulation sketch below.) |
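Both datasets flagged under Open Datasets are publicly available. As a minimal sketch, assuming the scikit-learn loader for 20-newsgroups and a plain TF-IDF vectorizer capped at the paper's fixed feature dimension d = 1000, the text side could be prepared as follows; the category subset here is a hypothetical placeholder, not the authors' source/target domain split, and the image side (CUB-200-2011 with VGG-19 features) is not covered.

```python
# Hedged sketch: load 20-newsgroups text and build d = 1000 TF-IDF features,
# since the paper fixes the feature dimension of all domains to 1000.
# The scikit-learn loader, the category subset, and the vectorizer settings
# are assumptions here, not the authors' pipeline.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical category subset; the paper defines its own source/target domains.
categories = ["comp.graphics", "comp.sys.mac.hardware", "rec.autos", "rec.motorcycles"]

newsgroups = fetch_20newsgroups(
    subset="all",
    categories=categories,
    remove=("headers", "footers", "quotes"),
)

# Cap the vocabulary at 1000 terms to match the fixed feature dimension d = 1000.
vectorizer = TfidfVectorizer(max_features=1000, stop_words="english")
X = vectorizer.fit_transform(newsgroups.data)   # sparse matrix, shape (n_docs, 1000)
y = newsgroups.target

print(X.shape, len(set(y)))
```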
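The Experiment Setup row also admits a quick consistency check: 50 workers giving 24 annotations each yields 1,200 labels, which at roughly five labels per task implies about 240 image tasks. The sketch below simulates such a crowd; the worker-type proportions and accuracy ranges are hypothetical placeholders (the paper's actual values are in its Table 3), and the constraint that every task receives at least one annotation is noted but not strictly enforced.

```python
# Hedged sketch of the simulated crowd from the Experiment Setup row.
# 50 workers x 24 annotations = 1,200 labels; at ~5 labels per task this
# implies roughly 240 image tasks. Proportions and accuracy ranges below
# are placeholders, NOT the values from the paper's Table 3.
import random

random.seed(0)

N_WORKERS = 50
LABELS_PER_WORKER = 24
N_TASKS = N_WORKERS * LABELS_PER_WORKER // 5   # ~240 tasks, ~5 labels each on average

# worker type -> (proportion, (min accuracy, max accuracy)); placeholder values.
WORKER_TYPES = {
    "spammer": (0.1, (0.0, 0.3)),
    "random":  (0.2, (0.4, 0.6)),
    "normal":  (0.5, (0.6, 0.8)),
    "expert":  (0.2, (0.8, 1.0)),
}

def sample_workers():
    """Assign each worker a type and an accuracy drawn from that type's range."""
    types = list(WORKER_TYPES)
    weights = [WORKER_TYPES[t][0] for t in types]
    workers = []
    for _ in range(N_WORKERS):
        t = random.choices(types, weights=weights, k=1)[0]
        lo, hi = WORKER_TYPES[t][1]
        workers.append({"type": t, "accuracy": random.uniform(lo, hi)})
    return workers

def assign_tasks():
    """Give each worker LABELS_PER_WORKER distinct tasks. The paper additionally
    requires every task to receive at least one annotation; with 1,200 random
    assignments over ~240 tasks that is very likely, but not enforced here."""
    annotations_per_task = {t: 0 for t in range(N_TASKS)}
    for _ in range(N_WORKERS):
        for t in random.sample(range(N_TASKS), LABELS_PER_WORKER):
            annotations_per_task[t] += 1
    return annotations_per_task

workers = sample_workers()
counts = assign_tasks()
print(f"{N_TASKS} tasks, {sum(counts.values()) / N_TASKS:.1f} annotations per task on average")
```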