Building Hierarchies of Concepts via Crowdsourcing

Authors: Yuyin Sun, Adish Singla, Dieter Fox, Andreas Krause

IJCAI 2015 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We evaluate our methodology on simulated data and on a set of real world application domains. Experimental results show that our system is robust to noise, efficient in picking questions, cost-effective, and builds high quality hierarchies. In our experiments, we evaluate four aspects of our approach: (1) the performance of approximate inference to estimate the weights representing distributions over trees; (2) the efficiency of active vs. random query strategies in selecting questions; (3) comparison to existing work; and (4) the ability to build hierarchies for diverse application domains.
Researcher Affiliation | Academia | Yuyin Sun, University of Washington, sunyuyin@cs.washington.edu; Adish Singla, ETH Zurich, adish.singla@inf.ethz.ch; Dieter Fox, University of Washington, fox@cs.washington.edu; Andreas Krause, ETH Zurich, krausea@ethz.ch
Pseudocode | Yes | Algorithm 1 (Weight Updating Algorithm). Input: W^(t-1), an answer a^(t), a threshold thr for the stopping criterion, and a non-negative regularization parameter β. Output: W^(t) that minimizes (9). Generate samples T_1, ..., T_m from P(T | W^(t-1)); use importance sampling to get the empirical distribution π; initialize Λ^(0) = 0, l = 1. Repeat: for each (i, j), set δ_{i,j} = arg min (10); update Λ^(l) = Λ^(l-1) + δ; l = l + 1; until ||δ|| ≤ thr. Return W^(t) = exp(Λ^(l)). A hedged Python sketch of this loop appears after the table.
Open Source Code | No | The paper mentions 'videos' on a 'project page' but does not provide a link to open-source code for its methodology. The acknowledgements credit data and code shared by Jonathan Bragg, not a release of the authors' own code.
Open Datasets | Yes | We compare our method with the most relevant systems DELUGE [Bragg et al., 2013] and CASCADE [Chilton et al., 2013], which also use crowdsourcing to build hierarchies. ... This dataset has 33 labels that are part of the fine-grained entity tags [Ling and Weld, 2012]. ... Food Item Names: This experiment investigates learning a hierarchy over food items used in a robotics setting [Lai et al., 2011a], where the goal is to learn names people use in a natural setting to refer to objects. Here, AMT workers were shown images from the RGB-D Object Dataset [Lai et al., 2011a] and asked to provide names they would use to refer to these objects.
Dataset Splits | No | No explicit statement of dataset splits (exact percentages, sample counts, citations to predefined splits, or a detailed splitting methodology) was found.
Hardware Specification | No | No specific hardware details (GPU/CPU models, processor speeds, memory amounts) used for running the experiments are mentioned.
Software Dependencies | No | The paper mentions using Amazon Mechanical Turk and external code (DELUGE) but does not list ancillary software for its own experimental setup (e.g., library or solver names with version numbers).
Experiment Setup | Yes | The number of samples for updating the weight matrix is fixed at 10,000 across all experiments. Overall, we found that β = 0.01 works robustly across different N and use that value in all the following experiments. We estimate this [the noise ratio] by gathering answers from 8 workers for each question, then take the majority vote as the answer, and use all answers to determine the noise ratio for that question. Five different path questions are put into one Human Intelligence Task (HIT), and each HIT costs $0.04. A short sketch of this majority-vote noise estimation follows the table.
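To make the flow of Algorithm 1 concrete, below is a minimal Python sketch of the weight-update loop. It assumes a toy setting: trees are represented as 0/1 adjacency matrices, `answer_likelihood` is a hypothetical helper that scores each sampled tree against the new answer, and the per-coordinate arg-min of Eq. (10), whose closed form is not reproduced in this report, is replaced by a simple regularized gradient-style step. The sketch illustrates the importance-sampling and stopping structure of the algorithm, not the paper's exact update.

```python
import numpy as np

def update_weights(W_prev, sample_trees, answer_likelihood, beta=0.01,
                   thr=1e-4, max_iters=100, step=0.1):
    """Heavily simplified sketch of Algorithm 1 (weight updating).

    W_prev            : (n, n) weight matrix from the previous round, W^(t-1).
    sample_trees      : list of (n, n) 0/1 adjacency matrices, drawn
                        (approximately) from P(T | W^(t-1)).
    answer_likelihood : hypothetical helper, tree -> P(a^(t) | T).
    beta              : non-negative regularization parameter (paper uses 0.01).
    thr               : stopping threshold on the update magnitude.
    Returns W^(t) = exp(Lambda) after coordinate-style updates.
    """
    # Importance sampling: reweight each sampled tree by the likelihood
    # of the new answer to get an empirical posterior distribution pi.
    likelihoods = np.array([answer_likelihood(T) for T in sample_trees], float)
    pi = likelihoods / likelihoods.sum()

    # Empirical edge marginals under pi: marginals[i, j] = sum_k pi[k] * T_k[i, j].
    marginals = sum(p * T for p, T in zip(pi, sample_trees))

    Lambda = np.log(np.clip(W_prev, 1e-12, None))
    for _ in range(max_iters):
        # Stand-in for the per-coordinate arg-min of Eq. (10): nudge each
        # log-weight toward the empirical edge marginal, shrunk by beta.
        delta = step * (marginals - np.exp(Lambda)) - beta * Lambda
        Lambda = Lambda + delta            # Lambda^(l) = Lambda^(l-1) + delta
        if np.abs(delta).max() <= thr:     # until ||delta|| <= thr
            break
    return np.exp(Lambda)

# Tiny usage example: two candidate trees over 3 nodes, and an answer
# that favors trees containing the edge (1, 2).
T1 = np.array([[0, 1, 0], [0, 0, 1], [0, 0, 0]], float)
T2 = np.array([[0, 1, 1], [0, 0, 0], [0, 0, 0]], float)
W0 = np.ones((3, 3))
W1 = update_weights(W0, [T1, T2], lambda T: 0.9 if T[1, 2] else 0.1)
```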
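The noise-estimation protocol described in the Experiment Setup row (8 workers per question, majority vote as the answer, disagreement with the majority as the noise ratio) is simple enough to state in code. A minimal sketch, with hypothetical input shapes:

```python
from collections import Counter

def majority_and_noise(answers_per_question):
    """Per-question majority vote and noise ratio from crowd answers.

    answers_per_question: dict mapping question id -> list of worker answers
    (e.g. 'yes'/'no' for a path question; the paper gathers 8 per question).
    Returns question id -> (majority answer, fraction disagreeing with it).
    """
    results = {}
    for q, answers in answers_per_question.items():
        majority, count = Counter(answers).most_common(1)[0]
        noise = 1.0 - count / len(answers)
        results[q] = (majority, noise)
    return results

# Example: 8 workers answer one path question, 6 'yes' vs. 2 'no'.
print(majority_and_noise({"q1": ["yes"] * 6 + ["no"] * 2}))
# -> {'q1': ('yes', 0.25)}
```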