Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Learning Relative Gene Expression Trends from Pathology Images in Spatial Transcriptomics

Authors: Kazuya Nishimura, Haruka Hirose, Ryoma Bise, Kaito Shiku, Yasuhiro Kojima

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	Experiments using synthetic datasets and real datasets demonstrate the effectiveness of the proposed method.
Researcher Affiliation	Academia	(1) Laboratory of Computational Life Science, National Cancer Center Japan; (2) Department of Advanced Information Technology, Kyushu University, Japan
Pseudocode	No	The paper describes the methodology using mathematical formulations and textual descriptions in sections 3.1 and 3.2, but it does not contain a clearly labeled pseudocode block or algorithm.
Open Source Code	Yes	The code is available at https://github.com/naivete5656/STRank.
Open Datasets	Yes	To evaluate the effectiveness of our proposed method, we performed experiments using seven datasets from the benchmark of the HEST-1k dataset [8]: IDC, PRAD, PAAD, COAD, READ, ccRCC, and IDC-Lymph Node. ... We used HEST-1k [8] with CC BY-NC-SA 4.0 for the real datasets.
Dataset Splits	Yes	For training, 50,000 samples were independently sampled from each patient. The validation and test sets, each consisting of 10,000 samples, are sampled from a uniform distribution over the interval [0,1] in both setups. ... To avoid train/test patient-level data leakage, we used patient-stratified splits and one patient for validation and testing data, respectively, and the other patients were used for training data.
Hardware Specification	Yes	Experiment 1 (Cloud environment): CPU: 16 assigned physical CPU cores; Memory: 320 GB. Experiment 2 (Internal desktop environment): CPU: 12th Gen Intel(R) Core(TM) i9-12900KS, Physical Cores: 16; GPU: NVIDIA RTX A6000; Memory: 128 GB.
Software Dependencies	No	We implemented our method with PyTorch [20] with modified BSD LICENSE, PyTorch Lightning [6] with Apache-2.0 LICENSE. The paper names the software and cites the original publications, but does not provide specific version numbers (e.g., PyTorch 1.9, PyTorch Lightning 1.x).
Experiment Setup	Yes	A simple MLP (multi-layer perceptron) with 3 linear layers ([1, 128], [128, 128], [128, 1]) with ReLU was used for the model. The epoch was 2000 using AdamW [13] with a learning rate 1e-3 with mini-batch size = 256. For the scheduler, we used Cosine Annealing [12].
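
The patient-stratified split quoted under Dataset Splits (one patient for validation, one for testing, the remaining patients for training) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `patient_level_split` and `assign_samples` are hypothetical, and choosing the held-out patients by a seeded shuffle is an assumption, since this entry does not say how they were selected.

```python
import random
from typing import Dict, List, Sequence, Tuple


def patient_level_split(patient_ids: Sequence[str], seed: int = 0) -> Dict[str, List[str]]:
    """Hold out one patient for validation and one for testing; train on the rest.

    Assumption: held-out patients are picked by a seeded shuffle. The source
    only states that one patient each is used for validation and testing.
    """
    unique = sorted(set(patient_ids))  # sort first so the shuffle is reproducible
    if len(unique) < 3:
        raise ValueError("need at least three distinct patients")
    rng = random.Random(seed)
    rng.shuffle(unique)
    return {"val": [unique[0]], "test": [unique[1]], "train": sorted(unique[2:])}


def assign_samples(
    samples: Sequence[Tuple[str, str]], split: Dict[str, List[str]]
) -> Dict[str, List[str]]:
    """Route each (patient_id, sample_id) pair to the split that owns its patient."""
    lookup = {pid: name for name, pids in split.items() for pid in pids}
    parts: Dict[str, List[str]] = {name: [] for name in split}
    for pid, sample_id in samples:
        parts[lookup[pid]].append(sample_id)
    return parts


if __name__ == "__main__":
    # Hypothetical toy data: (patient_id, sample_id) pairs.
    toy = [("P1", "a"), ("P1", "b"), ("P2", "c"), ("P3", "d"), ("P4", "e")]
    split = patient_level_split([pid for pid, _ in toy], seed=0)
    print(split)
    print(assign_samples(toy, split))
```

Because whole patients are assigned to a split before any sample is routed, no patient can contribute samples to both the training data and the held-out data, which is the leakage the quoted passage is guarding against.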