A Novice-Reviewer Experiment to Address Scarcity of Qualified Reviewers in Large Conferences

Authors: Ivan Stelmakh, Nihar B. Shah, Aarti Singh, Hal Daumé III

AAAI 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this work, we consider the problem of reviewer recruiting with a focus on the scarcity of qualified reviewers in large conferences. Specifically, we design a procedure for (i) recruiting reviewers from the population not typically covered by major conferences and (ii) guiding them through the reviewing pipeline. In conjunction with ICML 2020, a large, top-tier machine learning conference, we recruit a small set of reviewers through our procedure and compare their performance with the general population of ICML reviewers. Our experiment reveals that a combination of the recruiting and guiding mechanisms allows for a principled enhancement of the reviewer pool and results in reviews of superior quality compared to the conventional pool of reviews, as evaluated by senior members of the program committee (meta-reviewers).
Researcher Affiliation | Collaboration | Ivan Stelmakh (1), Nihar B. Shah (1), Aarti Singh (1), Hal Daumé III (2,3). (1) School of Computer Science, Carnegie Mellon University; (2) University of Maryland, College Park; (3) Microsoft Research, New York
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | The paper does not provide concrete access to source code for its methodology.
Open Datasets | No | We solicited 19 anonymized preprints in various sub-areas of ML from colleagues at various research labs, ensuring that authors of these papers do not participate in the experiment as subjects. ... The final pool of papers consisted of working papers, papers under review at other conferences, workshop publications and unpublished manuscripts. The papers were 6-12 pages long excluding references and appendices (a standard range for many ML conferences) and were formatted in various popular journal and conference templates with all explicit venue identifiers removed. While the papers are described, no public access information (link, DOI, specific repository, or citation to an established public dataset) is provided for this set of 19 papers.
Dataset Splits | No | The paper describes selection and evaluation procedures for human reviewers but does not specify training, validation, or test splits; the study does not involve machine learning datasets or models to which such splits would apply.
Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments.
Software Dependencies | No | The paper does not list specific software dependencies with version numbers (e.g., libraries or solvers) needed to replicate the experiment.
Experiment Setup | Yes | The high-level idea of our selection mechanism is to pretest abilities of candidates to write high-quality reviews. To this end, we frame the experiment as an auxiliary peer-review process that mimics the pipeline of the real ML conferences as explained below and ask participants to serve as reviewers for this conference. ... We gave participants 15 days to complete the review and then extended the deadline for 16 more days... we eventually invited 52 participants whose reviews received excellent feedback... Throughout the conference review process, the EXPERIMENTAL reviewers were offered additional mentorship...
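
The experiment setup quoted above describes a procedural selection pipeline rather than a computational one, but a minimal sketch may make the filtering step concrete: each candidate writes a trial review, the trial review receives feedback, and only candidates whose reviews are rated at the top of the scale are invited (52 participants in the actual experiment). The Python below is a hypothetical illustration under assumed names (Candidate, FEEDBACK_SCALE, select_reviewers) and an assumed four-point feedback scale; it is not the authors' implementation.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical feedback scale; the paper only states that candidates whose
# trial reviews received excellent feedback were invited (52 in total).
FEEDBACK_SCALE = ("poor", "fair", "good", "excellent")

@dataclass
class Candidate:
    name: str
    trial_review_feedback: str  # rating given to the candidate's trial review

def select_reviewers(candidates: List[Candidate],
                     required_feedback: str = "excellent") -> List[Candidate]:
    """Keep only candidates whose trial review met the required rating."""
    if required_feedback not in FEEDBACK_SCALE:
        raise ValueError(f"unknown feedback level: {required_feedback}")
    return [c for c in candidates if c.trial_review_feedback == required_feedback]

if __name__ == "__main__":
    pool = [
        Candidate("A", "excellent"),
        Candidate("B", "good"),
        Candidate("C", "excellent"),
    ]
    invited = select_reviewers(pool)
    print(f"Invited {len(invited)} of {len(pool)} candidates")
```

In the actual study the feedback on trial reviews came from evaluations by experienced researchers; the scale and threshold shown here are placeholders for that judgment, not values reported in the paper.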