IWBVT: Instance Weighting-based Bias-Variance Trade-off for Crowdsourcing

Authors: Wenjun Zhang, Liangxiao Jiang, Chaoqun Li

NeurIPS 2024

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | "To validate the effectiveness of IWBVT, we conduct a series of experiments on the whole 34 simulated and 2 real-world crowdsourced datasets published on the Crowd Environment and its Knowledge Analysis (CEKA) [33] platform." |
| Researcher Affiliation | Academia | Wenjun Zhang, School of Computer Science, China University of Geosciences, Wuhan 430074, China (wjzhang@cug.edu.cn); Liangxiao Jiang, School of Computer Science, China University of Geosciences, Wuhan 430074, China (ljiang@cug.edu.cn); Chaoqun Li, School of Mathematics and Physics, China University of Geosciences, Wuhan 430074, China (chqli@cug.edu.cn) |
| Pseudocode | Yes | "Algorithm 1 The learning process of IWBVT" |
| Open Source Code | Yes | "Our codes and datasets are available at https://github.com/jiangliangxiao/IWBVT." |
| Open Datasets | Yes | "To validate the effectiveness of IWBVT, we conduct a series of experiments on the whole 34 simulated and 2 real-world crowdsourced datasets published on the Crowd Environment and its Knowledge Analysis (CEKA) [33] platform." |
| Dataset Splits | Yes | "For each simulation, we evaluate the original model quality and the corresponding model quality improved using IWBVT through stratified 10-fold cross-validation." |
| Hardware Specification | Yes | "All experiments are conducted on a Windows 10 machine with an AMD Athlon(tm) X4 860K Quad Core Processor @ 3.70 GHz and 16 GB of RAM." |
| Software Dependencies | No | The paper mentions using the Waikato Environment for Knowledge Analysis (WEKA) for missing value replacement and Naive Bayes (NB) as the target model, but does not specify their version numbers or any other software dependencies with versions. |
| Experiment Setup | Yes | "For each simulation, we evaluate the original model quality and the corresponding model quality improved using IWBVT through stratified 10-fold cross-validation. Here, we use Naive Bayes (NB) [5] as the target model. The above processes are repeated ten times independently for each algorithm on each dataset." (A sketch of this protocol appears below the table.) |
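Taken together, the Software Dependencies and Experiment Setup rows describe the evaluation protocol: missing values are replaced with WEKA, and Naive Bayes is evaluated via stratified 10-fold cross-validation repeated ten times per dataset. Below is a minimal Java sketch of that protocol against the WEKA API (the toolkit CEKA builds on). The dataset path and per-run seeds are assumptions, and the sketch evaluates plain NB on a single dataset rather than the IWBVT-improved model, whose learning process is specified in the paper's Algorithm 1 and released code.

```java
import java.util.Random;

import weka.classifiers.Evaluation;
import weka.classifiers.bayes.NaiveBayes;
import weka.core.Instances;
import weka.core.converters.ConverterUtils.DataSource;
import weka.filters.Filter;
import weka.filters.unsupervised.attribute.ReplaceMissingValues;

public class EvalProtocolSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical path; the paper uses 34 simulated and 2 real-world
        // crowdsourced datasets published on the CEKA platform.
        Instances data = DataSource.read("datasets/example.arff");
        data.setClassIndex(data.numAttributes() - 1);

        // Missing value replacement with WEKA, as mentioned in the paper.
        ReplaceMissingValues rmv = new ReplaceMissingValues();
        rmv.setInputFormat(data);
        data = Filter.useFilter(data, rmv);

        // Ten independent repetitions of 10-fold cross-validation;
        // WEKA stratifies the folds internally for nominal class attributes.
        int repeats = 10;
        double sum = 0.0;
        for (int run = 0; run < repeats; run++) {
            Evaluation eval = new Evaluation(data);
            eval.crossValidateModel(new NaiveBayes(), data, 10, new Random(run));
            sum += eval.pctCorrect();
        }
        System.out.printf("Mean accuracy over %d x 10-fold CV runs: %.2f%%%n",
                repeats, sum / repeats);
    }
}
```

In the paper's setting, the same loop would wrap both the original NB model and the NB model trained with IWBVT's instance weights, so that the two accuracies can be compared on each dataset.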