Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

SUICA: Learning Super-high Dimensional Sparse Implicit Neural Representations for Spatial Transcriptomics

Authors: Qingtian Zhu, Yumin Zheng, Yuling Sang, Yifan Zhan, Ziyan Zhu, Jun Ding, Yinqiang Zheng

ICML 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	By extensive experiments of a wide range of common ST platforms under varying degradations, SUICA outperforms both conventional INR variants and SOTA methods regarding numerical fidelity, statistical correlation, and bio-conservation.
Researcher Affiliation	Academia	(1) The University of Tokyo, (2) McGill University, (3) MUHC Research Institute, (4) Duke-NUS Medical School, (5) Carnegie Mellon University, (6) Mila Quebec AI Institute. Correspondence to: Jun Ding <EMAIL>, Yinqiang Zheng <EMAIL>.
Pseudocode	No	The paper describes the overall pipeline in Figure 2 and in the '3.2. Method' section, but does not provide any explicit pseudocode or algorithm blocks.
Open Source Code	Yes	The code is available at https://github.com/Szym29/SUICA.
Open Datasets	Yes	For a quantitative benchmarking, we involve a nanoscale resolution Stereo-seq (SpaTial Enhanced REsolution Omics-sequencing) dataset, MOSTA (Chen et al., 2022a). MOSTA consists of a total of 53 sagittal sections from C57BL/6 mouse embryos at 8 progressive stages using Stereo-seq, from which we take 1 slice for each stage (from E9.5 to E16.5) for benchmarking. In addition to Stereo-seq, we also leverage ST data by other common platforms, i.e., Slide-seq V2, 10x Genomics Visium (see Appendix) and MERFISH (see Appendix), to further demonstrate the generalization of SUICA.
Dataset Splits	Yes	for spatial imputation, we randomly sample 80% of the spots for training, leaving the rest 20% for evaluation; for gene imputation, we randomly mute 70% of the elements in the data matrices; for denoising, a standard Gaussian noise is injected to the raw data.
Hardware Specification	Yes	All experiments can be conducted on 1 NVIDIA RTX 4090.
Software Dependencies	No	We implement SUICA with PyTorch. For the implementation, we re-compile the library of tiny-cuda-nn to support float32, to align with other methods. However, no specific version numbers are provided for PyTorch or tiny-cuda-nn.
Experiment Setup	Yes	During the pre-training phase of GAE, we use Adam with a learning rate of 1e-5 for 200 epochs. After obtaining low-dimensional cell embeddings, we train the INR with Adam at a learning rate of 1e-4 for 1k epochs to learn the embedding mapping. Subsequently, the INR is frozen, with the pre-trained decoder trained for an additional 1k epochs using Adam with the same learning rate. To construct the kNN graph for GCN, we set k = 5, including the given cell itself.
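
The evaluation protocol quoted in the Dataset Splits row and the kNN construction quoted in the Experiment Setup row can be illustrated with a minimal standard-library sketch. It is not taken from the SUICA repository. The function names (split_spots, mute_elements, add_gaussian_noise, knn_graph), the fixed seeds, the reading of "mute" as setting an entry to zero, and the use of Euclidean distance for the kNN graph are all assumptions made for illustration.

```python
import math
import random


def split_spots(n_spots, train_frac=0.8, seed=0):
    """Spatial imputation: randomly assign spots to train / evaluation sets."""
    rng = random.Random(seed)
    idx = list(range(n_spots))
    rng.shuffle(idx)
    n_train = round(train_frac * n_spots)
    return sorted(idx[:n_train]), sorted(idx[n_train:])


def mute_elements(matrix, mute_frac=0.7, seed=0):
    """Gene imputation: mute a random fraction of matrix entries.

    Assumption: "muting" is modelled as setting the entry to zero.
    Returns the degraded matrix and the set of muted (row, col) positions.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    positions = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    muted = set(rng.sample(positions, round(mute_frac * len(positions))))
    degraded = [
        [0.0 if (i, j) in muted else matrix[i][j] for j in range(n_cols)]
        for i in range(n_rows)
    ]
    return degraded, muted


def add_gaussian_noise(matrix, seed=0):
    """Denoising: add standard Gaussian noise (mean 0, std 1) to every entry."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, 1.0) for v in row] for row in matrix]


def knn_graph(coords, k=5):
    """Return, for each spot, the indices of its k nearest spots.

    The spot itself counts as one of the k neighbours, matching
    "k = 5, including the given cell itself". Euclidean distance is
    an assumption; ties at distance 0 are broken in favour of the spot.
    """
    neighbours = []
    for i, p in enumerate(coords):
        order = sorted(
            range(len(coords)),
            key=lambda j: (math.dist(p, coords[j]), j != i),
        )
        neighbours.append(order[:k])
    return neighbours
```

With these definitions, split_spots(100) yields 80 training and 20 held-out indices, and knn_graph(coords, k=5) returns five indices per spot, the first of which is the spot itself. A brute-force neighbour search is used only to keep the sketch dependency-free; real spatial transcriptomics slices would call for a spatial index or a library routine.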