Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Tabula: A Tabular Self-Supervised Foundation Model for Single-Cell Transcriptomics

Authors: Jiayuan Ding, Jianhui Lin, Shiyu Jiang, Yixin Wang, Ziyang Miao, Zhaoyu Fang, Jiliang Tang, Min Li, Xiaojie Qiu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Extensive experiments demonstrate the effectiveness of TABULA: despite using only half the pretraining data, TABULA achieves state-of-the-art performance across key tasks, including gene imputation, perturbation prediction, cell type annotation, and multi-omics integration.
Researcher Affiliation	Academia	Jiayuan Ding* (Stanford University); Jianhui Lin* (Central South University); Shiyu Jiang* (University of Southern California); Yixin Wang (Stanford University); Ziyang Miao (Central South University); Zhaoyu Fang (Central South University); Jiliang Tang (Michigan State University); Min Li (Central South University); Xiaojie Qiu (Stanford University)
Pseudocode	No	The paper describes methods through narrative text and figures (e.g., Figure 2 for an overview of the TABULA framework) but does not include explicit pseudocode or algorithm blocks.
Open Source Code	Yes	All resources are openly available at https://github.com/aristoteleo/tabula to support broad community adoption and future methodological advances.
Open Datasets	Yes	The pretraining dataset includes 1M cells from CELLxGENE (250K per tissue: pancreas, blood, brain, lung). [...] cell type annotation (hPancreas dataset [13]) and genetic perturbation prediction (Adamson [14] and Norman [15] datasets) [...]. We evaluate genetic perturbation prediction on three benchmark datasets: Adamson [14], Norman [15], and Replogle [17].
Dataset Splits	No	The paper mentions distributing data across clients and testing on datasets but does not explicitly provide specific train/validation/test split percentages, sample counts, or detailed splitting methodologies for its experiments in the main text.
Hardware Specification	No	The main text does not explicitly provide specific hardware details (exact GPU/CPU models, processor types, or memory amounts) used for running its experiments. Appendix C is mentioned for computer resources, but its content is not provided in the main text.
Software Dependencies	No	The paper does not provide specific ancillary software details, such as library names with version numbers, in the main text.
Experiment Setup	Yes	We set the corrupted ratio at 60%. [...] where the scaling factor α is used to balance the two loss terms to ensure comparable magnitudes. In this study, it is set to 0.03. [...] MLM uses a 15% masking ratio. [...] We select 1,200 HVGs per study.
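
The hyperparameters quoted in the Experiment Setup row can be gathered into a small configuration sketch. This is illustrative only and is not taken from the TABULA repository: the class, field, and function names are hypothetical, the assumption that both ratios are applied over the 1,200 selected HVGs is ours, and the quoted text does not say which of the two loss terms α scales, so the weighted sum below is one plausible reading.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ReportedSetup:
    """Values quoted in the Experiment Setup row; field names are hypothetical."""
    corruption_ratio: float = 0.60  # "corrupted ratio at 60%"
    loss_scale_alpha: float = 0.03  # scaling factor alpha balancing two loss terms
    mlm_mask_ratio: float = 0.15    # "MLM uses a 15% masking ratio"
    n_hvgs: int = 1200              # "1,200 HVGs per study"


def sample_positions(n_features, ratio, rng):
    """Pick round(ratio * n_features) distinct feature indices uniformly at random."""
    n_selected = round(ratio * n_features)
    return sorted(rng.sample(range(n_features), n_selected))


def combine_losses(loss_a, loss_b, alpha):
    """Assumed form loss_a + alpha * loss_b; the source does not state which term is scaled."""
    return loss_a + alpha * loss_b


setup = ReportedSetup()
rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Assumption: both ratios are taken over the selected HVGs of a single cell.
mlm_positions = sample_positions(setup.n_hvgs, setup.mlm_mask_ratio, rng)
corrupted_positions = sample_positions(setup.n_hvgs, setup.corruption_ratio, rng)

print(len(mlm_positions), len(corrupted_positions))
```

Under these assumptions, 1,200 features give 180 masked positions at the 15% ratio and 720 corrupted positions at the 60% ratio.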