Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

JADE: Joint Alignment and Deep Embedding for Multi-Slice Spatial Transcriptomics

Authors: Yuanchuan Guo, Jun Liu, Huimin Cheng, Ying Ma

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We benchmark JADE on multi-slice SRT datasets from the human dorsolateral prefrontal cortex (DLPFC) [45] and the regenerating axolotl brain [68], and show that it consistently outperforms state-of-the-art alignment and embedding methods in spatial clustering accuracy, alignment fidelity, and biological interpretability.
Researcher Affiliation	Academia	Yuanchuan Guo, Department of Statistics, Harvard University, EMAIL; Jun S. Liu, Department of Statistics, Tsinghua University, EMAIL; Huimin Cheng, Department of Biostatistics, Boston University, EMAIL; Ying Ma, Department of Biostatistics, Center for Computational Molecular Biology, Brown University, EMAIL
Pseudocode	Yes	Detailed descriptions of each module are provided in the following sections and the pseudo code is shown in Appendix A. Algorithm 1: JADE (pairwise training loop) Algorithm 2: Fast-JADE (coarse-to-fine alignment with hyperspots)
Open Source Code	Yes	The implementation of JADE, along with data preprocessing scripts and pretrained models, is publicly available at https://github.com/YMa-lab/JADE.
Open Datasets	Yes	We benchmark JADE on multi-slice SRT datasets from the human dorsolateral prefrontal cortex (DLPFC) [45] and the regenerating axolotl brain [68]... We also tested JADE on two additional datasets in Appendix H: the MERFISH dataset [10] and the Breast-Cancer Visium/Xenium dataset [26], previously used in SLAT [71].
Dataset Splits	No	The paper describes the datasets used (DLPFC, axolotl brain, MERFISH, breast cancer Visium/Xenium) and evaluates performance on whole slices or slice pairs. For example, 'The dataset comprises 12 serial tissue sections, including four sequential slices (A–D) from each of three donors (I–III)'. It does not specify classic training/test/validation splits for its own model in the typical machine learning sense to evaluate generalization. The method is applied directly to the full datasets/slices.
Hardware Specification	No	Table 8 presents the GPU runtime per epoch for Fast-JADE. As shown in the table, Fast-JADE demonstrates approximately linear scaling with respect to the data size, maintaining efficient performance even on large-scale data. However, no specific GPU model, CPU, or memory details are provided.
Software Dependencies	No	We followed the standardized preprocessing workflow implemented in the SCANPY package [69] to prepare the input data for our model. We normalized the length of each embedding vector and applied the mclust algorithm independently... For each spatial domain identified from the embeddings, we conducted differential expression analysis using the Wilcoxon rank-sum test... The paper mentions several software packages and algorithms but does not provide specific version numbers for them (e.g., 'SCANPY package', 'mclust algorithm', 'UMAP').
Experiment Setup	Yes	For each pairwise alignment task, we first selected the top 3,000 highly variable genes from each slice, then took the intersection of these two sets, yielding approximately 1,500 genes for every pair in DLPFC and 1,000 genes for the axolotl brain dataset... We employed a GCN with a single hidden layer to project the original expression matrix into a 64-dimensional latent space. Following Xu et al. [76], we set the neighborhood size to k = 3... Throughout, we fixed λ0 4 = 5.0. We used the Adam optimizer with the learning rate set to 0.002, the number of pretraining epochs set to 200, and the number of training epochs set to 800. We fixed λ2 = 10 and λ5 = 1 and chose a neighborhood size of k = 3.
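
The gene-selection step quoted under Experiment Setup (top 3,000 highly variable genes per slice, then the intersection of the two sets) can be illustrated with a minimal sketch. This is not the authors' implementation: ranking genes by plain variance is an assumption standing in for the SCANPY-based workflow the paper cites, and the function names and the `REPORTED_SETUP` dictionary keys are hypothetical. Only the numeric values in `REPORTED_SETUP` come from the quoted text.

```python
import numpy as np

# Values quoted in the Experiment Setup row; the key names are hypothetical.
REPORTED_SETUP = {
    "n_top_genes": 3000,
    "latent_dim": 64,
    "neighborhood_k": 3,
    "learning_rate": 0.002,
    "pretraining_epochs": 200,
    "training_epochs": 800,
}


def top_variable_genes(expr, gene_names, n_top):
    """Return the set of the n_top gene names with the highest variance.

    expr is a (spots x genes) array. Plain variance is an assumption here;
    the paper relies on SCANPY's preprocessing workflow instead.
    """
    variances = np.asarray(expr, dtype=float).var(axis=0)
    order = np.argsort(variances)[::-1][:n_top]
    return {gene_names[i] for i in order}


def shared_variable_genes(expr_a, genes_a, expr_b, genes_b, n_top=3000):
    """Intersect the per-slice top-variable gene sets for one slice pair."""
    top_a = top_variable_genes(expr_a, genes_a, n_top)
    top_b = top_variable_genes(expr_b, genes_b, n_top)
    return sorted(top_a & top_b)


# Toy example with two tiny "slices" (3 spots each, 4 genes each).
genes_a = ["g1", "g2", "g3", "g4"]
expr_a = np.array([[0, 0, 0, 0], [1, 10, 0, 5], [2, 20, 0, 10]])
genes_b = ["g2", "g3", "g4", "g5"]
expr_b = np.array([[0, 0, 0, 0], [10, 1, 0, 5], [20, 2, 0, 10]])

shared = shared_variable_genes(expr_a, genes_a, expr_b, genes_b, n_top=2)
print(shared)
```

Because the intersection is taken after the per-slice ranking, the shared set is usually smaller than the per-slice cutoff, which is consistent with the roughly 1,500 (DLPFC) and 1,000 (axolotl) genes reported from a cutoff of 3,000.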