Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
xTrimoGene: An Efficient and Scalable Representation Learner for Single-Cell RNA-Seq Data
Authors: Jing Gong, Minsheng Hao, Xingyi Cheng, Xin Zeng, Chiming Liu, Jianzhu Ma, Xuegong Zhang, Taifeng Wang, Le Song
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments also show that the performance of xTrimoGene improves as we scale up the model sizes, and it also leads to SOTA performance over various downstream tasks, such as cell type annotation, perturb-seq effect prediction, and drug combination prediction. |
| Researcher Affiliation | Collaboration | BioMap Research; Tsinghua University; Mohamed bin Zayed University of Artificial Intelligence |
| Pseudocode | No | The paper includes Figure 1, which illustrates the xTrimoGene framework, but this is a conceptual diagram, not structured pseudocode or an algorithm block. |
| Open Source Code | No | xTrimoGene model is now available for use as a service via the following link: https://api.biomap.com/xTrimoGene/apply. This indicates the model is available as an API service, but does not provide access to its source code. |
| Open Datasets | Yes | We evaluated xTrimoGene's performance on cell type annotation task with Zheng68K [39] and Segerstolpe [31] dataset, which has been widely benchmarked. |
| Dataset Splits | No | The paper mentions using benchmarked datasets and referring to Appendix 1 for data description and Appendix Table 2 for hyperparameter settings, but it does not explicitly provide specific training, validation, or test split percentages or sample counts in the main text. |
| Hardware Specification | Yes | The memory consumption for inference with the xTrimoGene-100M model is approximately 50GB, whose hardware requirement (Nvidia A100 80G GPU) is beyond some academic labs... |
| Software Dependencies | No | The paper does not explicitly list any software components with their specific version numbers (e.g., Python 3.x, PyTorch 1.x, CUDA 11.x) that would be needed to replicate the experiments. |
| Experiment Setup | Yes | To test the scale-up ability of xTrimoGene, we pre-trained three models across multiple compute regions and scales (e.g., from 3M to 100M parameters). The detailed hyperparameter setting is displayed in the App. Table 2. |