Cell2Sentence: Teaching Large Language Models the Language of Biology

Authors: Daniel Levine, Syed A Rizvi, Sacha Lévy, Nazreen Pallikkavaliyaveetil, David Zhang, Xingyu Chen, Sina Ghadermarzi, Ruiming Wu, Zihe Zheng, Ivan Vrkic, Anna Zhong, Daphne Raskin, Insu Han, Antonio Henrique De Oliveira Fonseca, Josue Ortega Caro, Amin Karbasi, Rahul Madhav Dhodapkar, David Van Dijk

ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our experiments reveal that GPT-2, when fine-tuned with C2S, can generate biologically valid cells based on cell type inputs, and accurately predict cell types from cell sentences.
Researcher Affiliation | Collaboration | 1 Department of Computer Science, Yale University, New Haven, CT, USA ... 6 Google ... 9 Roski Eye Institute, University of Southern California, Los Angeles, CA, USA
Pseudocode | No | No structured pseudocode or algorithm blocks were found in the paper.
Open Source Code | No | We plan to open-source our software and cell sentence datasets.
Open Datasets | Yes | We focus our experiments on three datasets with extensive natural language metadata and labels, allowing us to leverage the capabilities of base models. Immune tissue (Domínguez Conde et al., 2022) ... Cytokine stimulation (Dong et al., 2023) ... Multi-tissue (Megill et al., 2021) ... L1000 (Subramanian et al., 2017) and GTEx (Consortium, 2020)
Dataset Splits | Yes | We hold out 20% of cell sentences for validation (10%) and testing (10%). (See the split sketch below the table.)
Hardware Specification | Yes | Even on a p4d.24xlarge AWS instance with 8 A100 40GB GPUs, half-precision, and flash attention 2, we found it difficult to fit longer sequences without memory issues.
Software Dependencies | No | The paper mentions software tools and libraries such as Hugging Face, Scanpy, Pythia-160m, the AdamW optimizer, and Flash Attention, but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | We employ a learning rate of 6 × 10⁻⁴ with a cosine scheduler and a 1% warmup ratio. For the GPT-2 medium model, we accumulate gradients over 16 steps. The effective batch sizes for the small and medium models are 10 and 48 examples, respectively. (See the training-configuration sketch below the table.)
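The Dataset Splits row reports an 80/10/10 split of cell sentences. A minimal sketch of such a split, assuming the sentences are held in a Python list named `cell_sentences` and using scikit-learn; neither the variable name nor the library is specified by the paper.

```python
# Minimal sketch of the reported 80/10/10 split of cell sentences.
# `cell_sentences` and the use of scikit-learn are assumptions for illustration,
# not the authors' released code.
from sklearn.model_selection import train_test_split

def split_cell_sentences(cell_sentences, seed=0):
    # Hold out 20% of the cell sentences ...
    train, heldout = train_test_split(cell_sentences, test_size=0.2, random_state=seed)
    # ... and divide the held-out portion evenly into validation (10%) and test (10%).
    val, test = train_test_split(heldout, test_size=0.5, random_state=seed)
    return train, val, test
```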
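The Experiment Setup row quotes the fine-tuning hyperparameters. Below is a hedged sketch of how they might be expressed as Hugging Face `TrainingArguments` for the GPT-2 medium model; the output directory, per-device batch size, and fp16 flag are assumptions (the paper reports only the effective batch size, and half precision is mentioned in the hardware context), while the learning rate, scheduler, warmup ratio, and accumulation steps come from the quote above.

```python
# Hedged sketch of a Hugging Face TrainingArguments configuration mirroring the
# reported hyperparameters for GPT-2 medium. Values marked "assumption" are not
# stated in the paper.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="c2s_gpt2_medium",        # assumption: illustrative path
    learning_rate=6e-4,                  # reported learning rate
    lr_scheduler_type="cosine",          # cosine scheduler
    warmup_ratio=0.01,                   # 1% warmup ratio
    gradient_accumulation_steps=16,      # reported for the GPT-2 medium model
    per_device_train_batch_size=3,       # assumption; see note below
    fp16=True,                           # half precision, per the hardware row
)
```

In this framing the effective batch size is per_device_train_batch_size × number of devices × gradient_accumulation_steps; 3 × 1 × 16 matches the reported 48 on a single device, but the actual device layout used for training is not stated.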