Eliciting Honest Information from Authors Using Sequential Review

Authors: Yichi Zhang, Grant Schoenebeck, Weijie Su

AAAI 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To evaluate the performance of the sequential review mechanism, we use the parallel review mechanism as the baseline, which unconditionally reviews all papers. We further use the isotonic mechanism with oracle access to the true ranking information as an upper bound. Our simulation results suggest that, compared with the baseline, the sequential review mechanism can improve the conference utility towards the upper bound by over 40% when the author submits more than three papers. This effect is even more significant when 1) each author has more papers, 2) papers are more likely to be of low quality, and 3) reviewers are noisier. Moreover, we empirically investigate the number of reviews that a sequential review mechanism can save while achieving the same conference utility as the parallel review mechanism. We employ the ICLR Open Review datasets spanning recent years and develop a more realistic review model. Our results indicate that about 20% of the review burden can be saved when utilizing the sequential review mechanism.
Researcher Affiliation | Academia | Yichi Zhang (1), Grant Schoenebeck (1), Weijie Su (2). (1) School of Information, University of Michigan; (2) Department of Computer and Information Science, University of Pennsylvania. yichiz@umich.edu, schoeneb@umich.edu, suw@wharton.upenn.edu
Pseudocode | No | The paper describes mechanisms textually and mathematically but does not include any pseudocode or clearly labeled algorithm blocks.
Open Source Code | No | The paper does not provide any concrete access information (e.g., a specific link to a repository or an explicit statement about code release) for the methodology described.
Open Datasets | Yes | We employ the ICLR Open Review datasets spanning recent years and develop a more realistic review model.
Dataset Splits | No | The paper describes its experimental setup, including synthetic data generation and optimization, but it does not specify exact training, validation, or test dataset splits (e.g., percentages or counts) or refer to standard predefined splits for reproducibility.
Hardware Specification | No | The paper describes experiments and simulations but does not specify any hardware details (e.g., CPU, GPU models, memory) used for running them.
Software Dependencies | No | The paper mentions methods like Monte-Carlo and stochastic gradient descent but does not specify any software or library names with version numbers.
Experiment Setup | Yes | Model and Experiment Setup: Now, we introduce the parametric setting that is used to generate synthetic data for our experiments. First, suppose the author draws n i.i.d. paper qualities from N(µq, σq). Then, the conference observes the true ranking of these samples. Let q be the ordered vector of paper qualities from high to low. Next, the conference draws i.i.d. review noise ϵi ∼ N(0, σr) for each i ∈ [n]. Finally, review scores are observed: ri = qi + ϵi. The parameters (n, µq, σq, σr) define a Gaussian review model φg. Given φg and a mechanism M, we use the Monte-Carlo method with 10,000 samples of q to estimate the expected conference utility. We defer the details of this estimation to the full version. For each parameter setting, we further optimize the threshold(s) of the three types of mechanisms using stochastic gradient descent.
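
The Gaussian review model quoted above is concrete enough to sketch in code. The Python/NumPy sketch below samples ordered paper qualities and noisy review scores from the model (n, µq, σq, σr) and Monte-Carlo estimates the expected conference utility of a parallel mechanism versus a simple threshold-based sequential stopping rule, echoing the baseline-versus-sequential comparison reported above. The utility function (sum of accepted paper qualities minus a per-review cost), the sequential stopping rule, the threshold tau, and all parameter values are illustrative assumptions; the paper's actual mechanisms and utility definition are not specified in the excerpts above.

```python
import numpy as np

def sample_reviews(n, mu_q, sigma_q, sigma_r, rng):
    """Draw one author's papers under the Gaussian review model phi_g.

    Qualities are i.i.d. N(mu_q, sigma_q^2), sorted from high to low
    (the conference observes the true ranking); each review score is the
    quality plus i.i.d. N(0, sigma_r^2) noise.
    """
    q = np.sort(rng.normal(mu_q, sigma_q, n))[::-1]  # ordered qualities, high to low
    r = q + rng.normal(0.0, sigma_r, n)              # noisy review scores
    return q, r

def parallel_utility(q, r, tau, review_cost):
    """Parallel review: review every paper, accept those scoring above tau.
    Utility here is a hypothetical stand-in: accepted quality minus review cost."""
    accepted = r >= tau
    return q[accepted].sum() - review_cost * len(q)

def sequential_utility(q, r, tau, review_cost):
    """Hypothetical sequential rule: review papers in the reported (ranked)
    order and stop at the first score below tau, saving later reviews."""
    utility, reviews = 0.0, 0
    for qi, ri in zip(q, r):
        reviews += 1
        if ri >= tau:
            utility += qi
        else:
            break
    return utility - review_cost * reviews

def expected_utility(mechanism, phi_g, tau, review_cost, n_samples=10_000, seed=0):
    """Monte-Carlo estimate of the expected conference utility of a mechanism."""
    n, mu_q, sigma_q, sigma_r = phi_g
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        q, r = sample_reviews(n, mu_q, sigma_q, sigma_r, rng)
        total += mechanism(q, r, tau, review_cost)
    return total / n_samples

phi_g = (5, 0.0, 1.0, 0.5)  # (n, mu_q, sigma_q, sigma_r), illustrative values only
print(expected_utility(parallel_utility,   phi_g, tau=0.5, review_cost=0.1))
print(expected_utility(sequential_utility, phi_g, tau=0.5, review_cost=0.1))
```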
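The excerpt also notes that the threshold(s) are optimized with stochastic gradient descent. The Monte-Carlo utility above is noisy and non-smooth in the threshold, so as a stand-in for the paper's SGD procedure (not necessarily the authors' estimator), the sketch below performs stochastic ascent with a two-point finite-difference gradient estimate, reusing expected_utility and phi_g from the previous sketch.

```python
def optimize_threshold(mechanism, phi_g, review_cost, tau0=0.0,
                       lr=0.05, eps=0.05, steps=200, batch=500):
    """Stochastic ascent on the Monte-Carlo utility using a two-point
    finite-difference gradient estimate in the threshold tau.
    The same seed is reused for both evaluations at each step (common
    random numbers) to reduce the variance of the gradient estimate."""
    tau = tau0
    for t in range(steps):
        u_plus = expected_utility(mechanism, phi_g, tau + eps, review_cost,
                                  n_samples=batch, seed=t)
        u_minus = expected_utility(mechanism, phi_g, tau - eps, review_cost,
                                   n_samples=batch, seed=t)
        grad = (u_plus - u_minus) / (2 * eps)
        tau += lr * grad  # ascend, since we maximize expected utility
    return tau

best_tau = optimize_threshold(sequential_utility, phi_g, review_cost=0.1)
print(best_tau)
```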