Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al.: K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026.

Identification and Estimation of the Bi-Directional MR with Some Invalid Instruments

Authors: Feng Xie, Zhen Yao, Lin Xie, Yan Zeng, Zhi Geng

NeurIPS 2024 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we conduct various simulation studies to evaluate the performance of the proposed PReBiM method.
Researcher Affiliation | Academia | Feng Xie¹, Zhen Yao¹, Lin Xie¹, Yan Zeng¹, Zhi Geng¹; ¹Beijing Technology and Business University
Pseudocode | Yes | Algorithm 1 PReBiM
Open Source Code | Yes | Our source code can be found in the Supplementary Materials.
Open Datasets | Yes | We first apply our method to analyze the bi-directional causal relationships between obesity and vitamin D status using the GWAS data from Vimaleswaran et al. [2013]. Next, we apply our method to analyze the causal effect of institutions on economic development using the Colonial Origins dataset from [Acemoglu et al., 2001].
Dataset Splits | No | T1: Sensitivity to Sample Size. We evaluated the impact of different sample sizes: n = 2k, 5k, and 10k, where k equals 1,000.
Hardware Specification | Yes | All experiments were conducted using AMD Ryzen 7 7735H with Radeon Graphics processors, operating at a base speed of 3.20 GHz, and equipped with 16.0 GB (15.2 GB available) of RAM.
Software Dependencies | No | For sisVIVE algorithm, we used the implementations in the R sisVIVE package, which can be downloaded at https://cran.r-project.org/web/packages/sisVIVE/.
Experiment Setup | Yes | To compare the performance of these methods in a realistic setting, analogous to Slob and Burgess [2020], the genetic variants are modeled as Single Nucleotide Polymorphisms (SNPs), with a varying minor allele frequency $\mathrm{maf}_j$, and take values 0, 1, or 2. The minor allele frequencies are drawn from a uniform distribution. Specifically, the data generation process for the bi-directional model is as follows: $U = G\gamma_U + \varepsilon_1$ (11), $X = Y\beta_{Y \to X} + G\gamma_X + U\gamma_{X,U} + \varepsilon_2$ (12), $Y = X\beta_{X \to Y} + G\gamma_Y + U\gamma_{Y,U} + \varepsilon_3$ (13), $G_{ij} \sim \mathrm{Binomial}(2, \mathrm{maf}_j)$, $\mathrm{maf}_j \sim U(0.1, 0.5)$ (14), where the error terms $\varepsilon_1, \varepsilon_2, \varepsilon_3$ each follow an independent normal distribution with mean 0 and unit variance. The causal effects $\beta_{Y \to X}$ and $\beta_{X \to Y}$ are generated from a uniform distribution over $[-1, -0.5] \cup [0.5, 1]$.
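
The data generation process quoted in the Experiment Setup row (equations (11) to (14)) can be illustrated with a minimal Python sketch. This is not the authors' code. The function names `draw_effect` and `simulate`, the ranges used for the gamma coefficients, and the unit confounder loadings are all hypothetical choices made only for illustration, since the excerpt above does not state them. Because X and Y each appear in the other's equation, the sketch solves equations (12) and (13) jointly, which assumes the product of the two causal effects differs from 1.

```python
import random


def draw_effect(rng):
    """Draw a causal effect uniformly from [-1, -0.5] U [0.5, 1]."""
    magnitude = rng.uniform(0.5, 1.0)
    return magnitude if rng.random() < 0.5 else -magnitude


def simulate(n, n_snps, seed=0):
    """Simulate n individuals following equations (11)-(14)."""
    rng = random.Random(seed)

    # Equation (14): one minor allele frequency per SNP, drawn from U(0.1, 0.5).
    maf = [rng.uniform(0.1, 0.5) for _ in range(n_snps)]

    # ASSUMPTION: hypothetical coefficient ranges, not taken from the source.
    gamma_u = [rng.uniform(0.0, 0.2) for _ in range(n_snps)]
    gamma_x = [rng.uniform(0.1, 0.3) for _ in range(n_snps)]
    gamma_y = [rng.uniform(0.0, 0.2) for _ in range(n_snps)]
    gamma_xu, gamma_yu = 1.0, 1.0  # ASSUMPTION: unit confounder loadings.

    beta_xy = draw_effect(rng)  # effect of X on Y
    beta_yx = draw_effect(rng)  # effect of Y on X
    det = 1.0 - beta_xy * beta_yx
    if abs(det) < 1e-8:
        raise ValueError("beta_xy * beta_yx is too close to 1; system is singular")

    G, U, X, Y = [], [], [], []
    for _ in range(n):
        # Equation (14): Binomial(2, maf_j) as the sum of two Bernoulli draws.
        g = [sum(rng.random() < p for _ in range(2)) for p in maf]

        def dot(coef):
            return sum(c * v for c, v in zip(coef, g))

        u = dot(gamma_u) + rng.gauss(0.0, 1.0)                 # equation (11)
        a = dot(gamma_x) + gamma_xu * u + rng.gauss(0.0, 1.0)  # (12) without Y term
        b = dot(gamma_y) + gamma_yu * u + rng.gauss(0.0, 1.0)  # (13) without X term

        # Solve X = beta_yx * Y + a and Y = beta_xy * X + b simultaneously.
        x = (a + beta_yx * b) / det
        y = beta_xy * x + b

        G.append(g)
        U.append(u)
        X.append(x)
        Y.append(y)

    return {"G": G, "U": U, "X": X, "Y": Y, "beta_xy": beta_xy, "beta_yx": beta_yx}


if __name__ == "__main__":
    data = simulate(n=2000, n_snps=10, seed=1)
    print(len(data["X"]), data["beta_xy"], data["beta_yx"])
```

Calling `simulate` with n = 2000, 5000, or 10000 mirrors the sample sizes listed in the Dataset Splits row. The sketch uses only the standard library so that it runs without extra dependencies; a vectorized version would be preferable for large n.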