Bayesian Decision Process for Budget-efficient Crowdsourced Clustering

Authors: Xiaozhou Wang, Xi Chen, Qihang Lin, Weidong Liu

IJCAI 2020

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Experimental | We provide simulation studies and real data analysis to demonstrate the performance of the proposed method. |
| Researcher Affiliation | Academia | (1) School of Mathematical Sciences, Shanghai Jiao Tong University, China; (2) Stern School of Business, New York University, USA; (3) Tippie College of Business, University of Iowa, USA; (4) MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University, China |
| Pseudocode | Yes | Algorithm 1: The Opt-KG policy for crowdsourced clustering with reliable workers; Algorithm 2: Bayesian decision process with unreliable workers based on the Opt-KG policy |
| Open Source Code | No | The paper does not explicitly state that source code for the described methodology is provided, nor does it include a link to a code repository. |
| Open Datasets | Yes | We compare different policies for clustering four datasets [Dua and Graff, 2017; Bagnall et al., 2019]: soybean, olive oil, meat and iris. |
| Dataset Splits | No | The paper reports clustering accuracies and NMI scores based on experiments, but it does not explicitly specify the training, validation, and test dataset splits (e.g., percentages or sample counts) needed for reproduction. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU models, CPU types, cloud platforms) used to run the experiments. |
| Software Dependencies | No | The paper mentions utilizing 'the graph partitioning algorithm [Hespanha, 2004] based on spectral factorization to solve the Min-K-Cut problem', which implies software usage (likely Matlab as per the citation), but no specific software versions or dependencies are listed. |
| Experiment Setup | Yes | We assume that the total budget T is 200 and the similarity parameter θ_ij between items in the same cluster is generated from Beta(a_s, b_s)... We compare the performance of the proposed Opt-KG policy with the KG policy and the random sampling policy (select item pairs randomly at every stage). We choose the simulated data with a larger size and conduct each policy with a mini-batch of size B = 100 and a total budget of T = 25... The uniform prior Beta(1,1) is used in the Opt-KG policy with the total budget T = 10, the batch size B = 200 for soybean and olive oil and B = 300 for meat and iris. |
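
The simulated setting quoted under Experiment Setup can be illustrated with a minimal sketch of the random sampling baseline. This is not the authors' code. The number of items and clusters, the Beta parameters (`same`, `diff`), the Bernoulli worker-answer model, and the function name `simulate_random_policy` are all assumptions chosen for illustration; only the budget T = 200, the Beta-distributed similarity for same-cluster pairs, the Beta(1,1) prior, and the "select item pairs randomly at every stage" rule come from the quoted text.

```python
import random


def simulate_random_policy(n_items=12, n_clusters=3, budget=200,
                           same=(4.0, 1.0), diff=(1.0, 4.0), seed=0):
    """Random sampling baseline with Beta posterior updates (illustrative)."""
    rng = random.Random(seed)

    # Hypothetical ground truth: items are assigned to clusters round-robin.
    cluster = [i % n_clusters for i in range(n_items)]
    pairs = [(i, j) for i in range(n_items) for j in range(i + 1, n_items)]

    # theta_ij ~ Beta(a_s, b_s) for same-cluster pairs. The parameters used
    # for different-cluster pairs are an assumption of this sketch.
    theta = {}
    for i, j in pairs:
        a, b = same if cluster[i] == cluster[j] else diff
        theta[(i, j)] = rng.betavariate(a, b)

    # Uniform prior Beta(1, 1) on every pair's similarity parameter.
    posterior = {p: [1.0, 1.0] for p in pairs}

    for _ in range(budget):
        p = rng.choice(pairs)              # pick an item pair at random
        label = rng.random() < theta[p]    # assumed Bernoulli(theta) answer
        posterior[p][0 if label else 1] += 1.0  # conjugate Beta update

    return theta, posterior


theta, posterior = simulate_random_policy()
# Posterior mean similarity of the first pair after spending the budget.
a, b = posterior[(0, 1)]
print(a / (a + b))
```

A policy such as KG or Opt-KG would replace the `rng.choice(pairs)` line with a rule that scores pairs using the current posteriors; that scoring rule is defined in the paper's Algorithms 1 and 2 and is not reproduced here.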