Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

CellCLIP - Learning Perturbation Effects in Cell Painting via Text-Guided Contrastive Learning

Authors: MingYu Lu, Ethan Weinberger, Chanwoo Kim, Su-In Lee

NeurIPS 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We benchmarked CellCLIP on profile-perturbation retrieval for unseen compounds [4] and its ability to recover known biological relationships between perturbations [10, 28]. We find that CellCLIP achieves strong performance compared to the previous state-of-the-art while taking a fraction of the time to train.
Researcher Affiliation	Academia	Mingyu Lu, Ethan Weinberger, Chanwoo Kim, Su-In Lee; Paul G. Allen School of Computer Science & Engineering, University of Washington; EMAIL
Pseudocode	No	No explicit pseudocode or algorithm blocks are provided. The methodology is described through prose and mathematical equations in Section 3.
Open Source Code	Yes	Code for reproducing our experiments is available at https://github.com/suinleelab/CellCLIP.
Open Datasets	Yes	For cross-modality retrieval, following Sanchez-Fernandez et al. [38], we utilize a curated version of Bray et al. [4], comprising approximately 284,034 five-channel Cell Painting images... The processed dataset is publicly available (see footnote 8). For replicate detection & sister perturbation matching, we employ CPJUMP1 [10]... Raw data, relevant metadata, and gene annotation can be found in footnote 9. For zero-shot gene-gene relationship recovery, we use RxRx3-core [28], a curated subset of RxRx3 [15]... Additional details regarding RxRx3-core can be found in Kraus et al. [28].
Dataset Splits	Yes	We partitioned the dataset by perturbation into train, validation, and test sets with a 70/10/20 split, resulting in 2,115 unseen small molecules in the test set. For each perturbation class, we applied a 70/10/20 split for training, validation, and testing.
Hardware Specification	Yes	All experiments are conducted on systems equipped with 64 CPU cores and the specified NVIDIA GPUs. Models were trained with the largest possible batch size on 8 RTX 6000 GPUs.
Software Dependencies	Yes	This study employs the PyTorch package tutorial (version 2.2.1).
Experiment Setup	Yes	For retrieval evaluation on Bray et al. [4], CellCLIP was trained for 50 epochs with a batch size of 768 using the AdamW optimizer. The learning rate was set to 2 × 10⁻⁴ with cosine annealing and restarts. The temperature parameter τ was initialized as 14.3. For CP-JUMP1, we reuse the model trained with Bray et al. [4] and further fine-tune with CP-JUMP1. Fine-tuning was performed for an additional 50 epochs using the same hyperparameter settings as in pretraining.
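
The Dataset Splits row describes partitioning by perturbation, so that test-set compounds are never seen during training. The following is a minimal sketch of such a grouped 70/10/20 split, not the authors' implementation; the function name `split_by_perturbation`, the seed, and the toy compound labels are all hypothetical and chosen only for illustration.

```python
import random


def split_by_perturbation(sample_perturbations, fractions=(0.7, 0.1, 0.2), seed=0):
    """Split sample indices so that each perturbation lands in exactly one set.

    sample_perturbations: one perturbation label per sample (hypothetical input).
    fractions: train/validation/test shares of the unique perturbations.
    """
    # Shuffle the unique perturbation labels, not the individual samples.
    unique = sorted(set(sample_perturbations))
    random.Random(seed).shuffle(unique)

    n_train = round(fractions[0] * len(unique))
    n_val = round(fractions[1] * len(unique))
    groups = {
        "train": set(unique[:n_train]),
        "val": set(unique[n_train:n_train + n_val]),
        "test": set(unique[n_train + n_val:]),  # remainder goes to test
    }

    # Route every sample index to the set that owns its perturbation.
    splits = {name: [] for name in groups}
    for index, perturbation in enumerate(sample_perturbations):
        for name, members in groups.items():
            if perturbation in members:
                splits[name].append(index)
                break
    return splits, groups


if __name__ == "__main__":
    # Toy data: 100 made-up compounds with 3 replicate images each.
    labels = [f"compound_{i}" for i in range(100) for _ in range(3)]
    splits, groups = split_by_perturbation(labels)
    print({name: len(members) for name, members in groups.items()})
```

Splitting on the perturbation label rather than on individual images is what makes the held-out compounds genuinely unseen; a per-image split would leak replicates of the same compound into training.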