Crowdsourcing with Multiple-Source Knowledge Transfer

Authors: Guangyang Han, Jinzheng Tu, Guoxian Yu, Jun Wang, Carlotta Domeniconi

IJCAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | Experimental results on real-world image and text datasets prove the effectiveness of Crowd MKT in improving the quality and reducing the budget. |
| Researcher Affiliation | Academia | (1) College of Computer and Information Sciences, Southwest University, Chongqing, China; (2) Department of Computer Science, George Mason University, VA, USA. {gyhan, tujinzheng, gxyu, kingjun}@swu.edu.cn, carlotta@cs.gmu.edu |
| Pseudocode | No | No structured pseudocode or algorithm blocks (e.g., a figure or section explicitly labeled 'Pseudocode' or 'Algorithm') were found in the paper. |
| Open Source Code | No | The paper does not provide any link or statement regarding the public availability of the source code for its methodology. |
| Open Datasets | Yes | We study the effectiveness of our Crowd MKT through a series of experiments on two real-world datasets (20-newsgroups and CUB-200-2011 [Wah et al., 2011]) with multiple source and target domains. (Loading sketch below.) |
| Dataset Splits | No | The paper mentions that the hyperparameter τ is 'tuned using a validation set', but it does not give the overall train/validation/test splits (e.g., exact percentages, sample counts, or citations to predefined splits) needed for reproduction. |
| Hardware Specification | Yes | Runtime is recorded on a PC with Windows 10 OS, an AMD Ryzen 7 2700X, and 16 GB RAM. |
| Software Dependencies | No | The paper names the software components and algorithms it uses (e.g., VGG-19, TF-IDF, the EM algorithm, L-BFGS, sparse coding) but gives no version numbers for any software dependency (e.g., 'PyTorch 1.9', 'Python 3.8'). |
| Experiment Setup | Yes | We fix the feature dimension of all domains to d = 1000, and set the dictionary size to k = 20. For Crowd MKT and its variants, we simply set γ = 0.1 and τ = 1. We simulate four types of workers (spammer, random, normal, and expert), with different capacity ranges and proportions as shown in Table 3. We generate 50 workers and ask each worker to give 24 annotations to the image tasks, each of which has to have at least one annotation. As a result, each task on average has five annotations. (Simulation sketch below.) |
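Both datasets flagged under Open Datasets are publicly available. As a minimal sketch, assuming the scikit-learn loader for 20-newsgroups and a plain TF-IDF vectorizer capped at the paper's fixed feature dimension d = 1000, the text side could be prepared as follows; the category subset here is a hypothetical placeholder, not the authors' source/target domain split, and the image side (CUB-200-2011 with VGG-19 features) is not covered.

```python
# Hedged sketch: load 20-newsgroups text and build d = 1000 TF-IDF features,
# since the paper fixes the feature dimension of all domains to 1000.
# The scikit-learn loader, the category subset, and the vectorizer settings
# are assumptions here, not the authors' pipeline.
from sklearn.datasets import fetch_20newsgroups
from sklearn.feature_extraction.text import TfidfVectorizer

# Hypothetical category subset; the paper defines its own source/target domains.
categories = ["comp.graphics", "comp.sys.mac.hardware", "rec.autos", "rec.motorcycles"]

newsgroups = fetch_20newsgroups(
    subset="all",
    categories=categories,
    remove=("headers", "footers", "quotes"),
)

# Cap the vocabulary at 1000 terms to match the fixed feature dimension d = 1000.
vectorizer = TfidfVectorizer(max_features=1000, stop_words="english")
X = vectorizer.fit_transform(newsgroups.data)   # sparse matrix, shape (n_docs, 1000)
y = newsgroups.target

print(X.shape, len(set(y)))
```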
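The Experiment Setup row also admits a quick consistency check: 50 workers giving 24 annotations each yields 1,200 labels, which at roughly five labels per task implies about 240 image tasks. The sketch below simulates such a crowd; the worker-type proportions and accuracy ranges are hypothetical placeholders (the paper's actual values are in its Table 3), and the constraint that every task receives at least one annotation is noted but not strictly enforced.

```python
# Hedged sketch of the simulated crowd from the Experiment Setup row.
# 50 workers x 24 annotations = 1,200 labels; at ~5 labels per task this
# implies roughly 240 image tasks. Proportions and accuracy ranges below
# are placeholders, NOT the values from the paper's Table 3.
import random

random.seed(0)

N_WORKERS = 50
LABELS_PER_WORKER = 24
N_TASKS = N_WORKERS * LABELS_PER_WORKER // 5   # ~240 tasks, ~5 labels each on average

# worker type -> (proportion, (min accuracy, max accuracy)); placeholder values.
WORKER_TYPES = {
    "spammer": (0.1, (0.0, 0.3)),
    "random":  (0.2, (0.4, 0.6)),
    "normal":  (0.5, (0.6, 0.8)),
    "expert":  (0.2, (0.8, 1.0)),
}

def sample_workers():
    """Assign each worker a type and an accuracy drawn from that type's range."""
    types = list(WORKER_TYPES)
    weights = [WORKER_TYPES[t][0] for t in types]
    workers = []
    for _ in range(N_WORKERS):
        t = random.choices(types, weights=weights, k=1)[0]
        lo, hi = WORKER_TYPES[t][1]
        workers.append({"type": t, "accuracy": random.uniform(lo, hi)})
    return workers

def assign_tasks():
    """Give each worker LABELS_PER_WORKER distinct tasks. The paper additionally
    requires every task to receive at least one annotation; with 1,200 random
    assignments over ~240 tasks that is very likely, but not enforced here."""
    annotations_per_task = {t: 0 for t in range(N_TASKS)}
    for _ in range(N_WORKERS):
        for t in random.sample(range(N_TASKS), LABELS_PER_WORKER):
            annotations_per_task[t] += 1
    return annotations_per_task

workers = sample_workers()
counts = assign_tasks()
print(f"{N_TASKS} tasks, {sum(counts.values()) / N_TASKS:.1f} annotations per task on average")
```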