Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

SAINT: Sequence-Aware Integration for Spatial Transcriptomics Multi-View Clustering

Authors: Zeyu Zhu, KE LIANG, Lingyuan Meng, Meng Liu, Suyuan Liu, Renxiang Guan, Miaomiao Li, Wanwei Liu, Xinwang Liu

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We conduct comprehensive experiments to evaluate the performance and robustness of our SAINT across multiple dimensions, i.e., superiority, effectiveness, transferability, sensitivity and case study. Specifically, we aim to answer the following five questions. Q1: Superiority. Does SAINT outperform existing state-of-the-art models on spatial transcriptomics clustering benchmarks?
Researcher Affiliation	Academia	1National University of Defense Technology, Changsha, China 2Changsha College, Changsha, China
Pseudocode	No	The paper describes its methods using mathematical formulations and descriptive text, but it does not include any clearly labeled 'Pseudocode' or 'Algorithm' blocks or figures.
Open Source Code	No	The authors mention that the source code will be released after the double-blind review.
Open Datasets	Yes	We conduct experiments on three benchmark datasets commonly used in spatial transcriptomics clustering: LIBD Human Dorsolateral Prefrontal Cortex (DLPFC) Dataset. The DLPFC dataset, curated by the LIBD research group [35]... 10x Visium Human Breast Cancer (HBC) Dataset. The HBC dataset [50]... Mouse Brain Anterior Tissue (MBA) Dataset. The MBA dataset [26]...
Dataset Splits	No	The paper uses various slices/sections from benchmark datasets (DLPFC, HBC, MBA) for evaluation but does not specify explicit training/validation/test splits of a single dataset for model training and evaluation in the traditional machine learning sense. The sections are treated as distinct test cases.
Hardware Specification	Yes	All models are implemented in PyTorch 2.0.1 and trained using the Adam optimizer [22] on a workstation with an Intel Core i9-9900K CPU, 64GB RAM, and an NVIDIA RTX 3090 Ti GPU.
Software Dependencies	Yes	All models are implemented in PyTorch 2.0.1 and trained using the Adam optimizer [22] on a workstation with an Intel Core i9-9900K CPU, 64GB RAM, and an NVIDIA RTX 3090 Ti GPU.
Experiment Setup	Yes	Following MAFN [62], we adopt consistent training settings and learning rate schedules. For sequence embedding, we evaluate two aggregation variants: SAINT-G: gene sequence embeddings are averaged without expression weighting. SAINT-SA: expression-aware attention pooling is applied to gene embeddings per spot. The projection dimensions d1 (for SAINT-G) and d2 (for SAINT-SA) are selected from {16, 32, 64, 128, 256}. ... $\mathcal{L} = \mathcal{L}_{\text{ZINB}} + \alpha \mathcal{L}_{\text{Reg}} + \gamma \mathcal{L}_{\text{DICR}}$, where α and γ are hyper-parameters balancing the topology regularization and cross-modal decorrelation objectives. For consistency and fair comparison, we adopt the same hyperparameter configuration as MAFN [62], using its default values across all experiments without further tuning.
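
The Experiment Setup entry above names two aggregation variants and a three-term objective. The following is a minimal, illustrative sketch of those ideas only, not the authors' implementation (their code is not yet released). The function names are hypothetical, and the softmax over raw expression values is an assumed stand-in for the paper's expression-aware attention pooling, which may well be learned. The projection to d1 or d2 dimensions is omitted.

```python
import math
from typing import List, Sequence


def mean_pool(gene_embeddings: Sequence[Sequence[float]]) -> List[float]:
    """SAINT-G-style aggregation: average gene embeddings, ignoring expression."""
    n = len(gene_embeddings)
    dim = len(gene_embeddings[0])
    return [sum(e[k] for e in gene_embeddings) / n for k in range(dim)]


def expression_weighted_pool(
    gene_embeddings: Sequence[Sequence[float]],
    expression: Sequence[float],
) -> List[float]:
    """Assumed stand-in for SAINT-SA pooling: softmax(expression) as weights.

    The real model may compute attention scores differently; this only shows
    how per-spot expression can reweight the gene embeddings.
    """
    peak = max(expression)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in expression]
    total = sum(exps)
    weights = [v / total for v in exps]
    dim = len(gene_embeddings[0])
    return [
        sum(w * e[k] for w, e in zip(weights, gene_embeddings))
        for k in range(dim)
    ]


def total_loss(
    l_zinb: float, l_reg: float, l_dicr: float, alpha: float, gamma: float
) -> float:
    """Combine the three terms as L = L_ZINB + alpha * L_Reg + gamma * L_DICR."""
    return l_zinb + alpha * l_reg + gamma * l_dicr
```

With equal expression values for every gene, the weighted pooling reduces to the plain average, which is one way to sanity-check the two variants against each other. The values of α and γ are not stated in the entry above; they are taken from MAFN's defaults according to the quoted text.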