reproducibilityindex.ai

Estimating the Number and Effect Sizes of Non-null Hypotheses

Authors: Jennifer Brennan, Ramya Korlakai Vinayak, Kevin Jamieson

ICML 2020

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We evaluate our estimator on both real and simulated data. We begin with the mixture of two Gaussians described by Eqn (5). Figure 3 shows the rate of convergence of our estimator for different values of γ, the alternate effect size. Note that the estimate never exceeds the true value ζ, and that it improves as n increases. The variance of our estimator, shown with bootstrapped 90% confidence intervals, can be large for small n but decreases as n increases.
Researcher Affiliation	Academia	1Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA. Correspondence to: Jennifer Brennan <jrb@cs.washington.edu>.
Pseudocode	No	The paper describes the intuition and properties of its estimator and states it "can be implemented as an efficient convex program" and "solved using off-the-shelf software (see Appendix C for details)", but it does not provide a formal pseudocode block or algorithm steps within the main text or appendix.
Open Source Code	Yes	A Python implementation is available at https://github.com/jenniferbrennan/CountingDiscoveries/.
Open Datasets	Yes	We evaluated our estimator on Z-scores from an experiment to identify which genes contribute to influenza replication in Drosophila, described by Hao et al. (2008).
Dataset Splits	No	The paper does not provide explicit details on dataset splits (e.g., specific percentages or counts for training, validation, or testing sets). For the real data, it mentions "two replicates" for the Drosophila genes but not how the data was partitioned for model development or evaluation.
Hardware Specification	No	The paper does not provide specific details about the hardware used to run the experiments, such as GPU models, CPU specifications, or cloud computing instance types.
Software Dependencies	No	The paper mentions a "Python implementation" and refers to "CVXPY" and "SCS" in Appendix C as tools used for convex optimization, but it does not specify version numbers for these or any other software dependencies. Therefore, it does not provide a reproducible description including specific version numbers.
Experiment Setup	Yes	For simulation experiments, the paper states: "After observing X_i ∼ N(µ_i, 1) for i = 1, ..., n with n = 10^4" and "For a fixed value of n = 10^4, we are interested in the probability...". For real data, it notes: "The data... consisted of Z-scores from two replicates for each of 13,071 genes." and "We found that σ² = 1/4 provided a good fit to the data; we used this value for the rest of our computations." These details describe the configuration for experiments.
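
The simulation setup quoted in the table can be illustrated with a short sketch. It assumes the two-Gaussian mixture takes the form (1 − ζ)·N(0, 1) + ζ·N(γ, 1), which is a guess at Eqn (5) and not confirmed by this page. The function name `simulate_z_scores` and the values of `n`, `zeta`, and `gamma` are hypothetical choices for illustration. This is not the authors' code, and it only generates data; it does not implement their estimator.

```python
import random


def simulate_z_scores(n, zeta, gamma, seed=0):
    """Draw X_i ~ N(mu_i, 1), with mu_i = gamma w.p. zeta and mu_i = 0 otherwise.

    Assumed mixture form; the paper's Eqn (5) may differ in detail.
    """
    rng = random.Random(seed)
    # Each hypothesis is non-null (effect size gamma) with probability zeta.
    mus = [gamma if rng.random() < zeta else 0.0 for _ in range(n)]
    # Observations have unit variance around their true means.
    xs = [rng.gauss(mu, 1.0) for mu in mus]
    return mus, xs


# Hypothetical parameter values, chosen only for illustration.
mus, xs = simulate_z_scores(n=10_000, zeta=0.1, gamma=2.0)
true_fraction = sum(mu != 0.0 for mu in mus) / len(mus)
sample_mean = sum(xs) / len(xs)
print(f"non-null fraction: {true_fraction:.3f}, sample mean: {sample_mean:.3f}")
```

The realized non-null fraction should sit close to `zeta`, and the sample mean close to `zeta * gamma`, which gives a quick sanity check on the generated data before passing it to any estimator.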