CryoGEM: Physics-Informed Generative Cryo-Electron Microscopy

Authors: Jiakai Zhang, Qihe Chen, Yan Zeng, Wenyuan Gao, Xuming He, Zhijie Liu, Jingyi Yu

NeurIPS 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive experiments show that CryoGEM is capable of generating authentic cryo-EM images. The generated dataset can be used as training data for particle picking and pose estimation models, eventually improving the reconstruction resolution.
Researcher Affiliation | Collaboration | Jiakai Zhang (1,2), Qihe Chen (1,2), Yan Zeng (1,2), Wenyuan Gao (1), Xuming He (1), Zhijie Liu (1,3), Jingyi Yu (1); 1: ShanghaiTech University, 2: Cellverse, 3: iHuman Institute
Pseudocode | No | The paper does not contain any pseudocode or algorithm blocks.
Open Source Code | No | We do not provide open-source code and dataset in this submission. However, we will release our code and dataset upon acceptance.
Open Datasets | Yes | We source the Proteasome, Ribosome, and Integrin datasets from the Electron Microscopy Public Image Archive (EMPIAR) [19], a global public resource offering raw cryo-EM micrographs and selected particle stacks for constructing high-resolution 3D molecular maps. The Phage MS2 and Human BAF datasets were downloaded from the CryoPPP dataset [11] with the provided manual particle-picking annotations. (A hedged download sketch appears after this table.)
Dataset Splits | No | The paper specifies training datasets and evaluation datasets, but does not explicitly mention a distinct validation dataset split for the training of its primary model or the fine-tuning of downstream models.
Hardware Specification | Yes | All experiments are conducted on a single NVIDIA GeForce RTX 3090 GPU.
Software Dependencies | No | The paper mentions software tools like CryoSPARC, RELION, Warp, Topaz, CycleGAN, CUT, CycleDiffusion, IceBreaker, CTFFIND4, cryoDRGN, PatchGAN, and U-Net, but it does not specify version numbers for these software components.
Experiment Setup | Yes | As a lightweight model, CryoGEM trains 100 epochs on a single dataset within an hour using less than 10 GB of memory. ... we set the hyper-parameter λ to 10.0 in all our experiments. ... We fix the image resolution to 1024×1024 during training. Guided by the particle-background mask, we evenly sample 256 queries on particles and 256 on the background, ensuring that their corresponding negative samples are located where the labels are opposite. ... we use 10 annotated micrographs to fine-tune the pre-trained Topaz for 20 epochs ... We set B=256 in our experiments. For various baselines, we train the pose estimation module for 200 epochs using only particles and their ground-truth poses. (A hedged sketch of the mask-guided sampling appears after this table.)
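The "Open Datasets" row above names EMPIAR as the source of the raw micrographs. As a rough illustration of how such an entry can be retrieved, the sketch below assumes EMPIAR's public HTTPS mirror under ftp.ebi.ac.uk; the accession number and file path are placeholders, not values taken from the paper.

```python
# Hedged sketch: fetching files for an EMPIAR entry over its public HTTPS mirror.
# The accession number and relative path below are placeholders for illustration;
# they are not stated in the paper, and the directory layout differs per entry.
import urllib.request
from pathlib import Path

EMPIAR_ACCESSION = "XXXXX"  # placeholder entry ID (e.g. for the Proteasome dataset)
BASE_URL = f"https://ftp.ebi.ac.uk/empiar/world_availability/{EMPIAR_ACCESSION}/"

# The relative paths would normally come from the entry's file listing.
relative_paths = ["data/micrograph_0001.mrc"]  # hypothetical example path

for rel in relative_paths:
    out = Path(rel).name
    urllib.request.urlretrieve(BASE_URL + rel, out)
    print(f"downloaded {out}")
```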
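The "Experiment Setup" row quotes the mask-guided contrastive sampling: 256 queries on particles, 256 on the background, with negatives drawn from locations of the opposite label. Below is a minimal PyTorch sketch of how such a PatchNCE-style term could be assembled; the function name, the temperature tau, and the exact positive/negative bookkeeping are assumptions based on the quoted setup, not the authors' released implementation.

```python
# Minimal sketch (assumptions noted above): mask-guided contrastive sampling
# with 256 particle queries and 256 background queries per image.
import torch
import torch.nn.functional as F

def mask_guided_patch_nce(feat_src, feat_tgt, mask, n_per_class=256, tau=0.07):
    """feat_src, feat_tgt: (C, H, W) encoder features of the input and translated images.
    mask: (H, W) binary particle-background mask (1 = particle, 0 = background)."""
    c = feat_src.shape[0]
    src = F.normalize(feat_src.reshape(c, -1).t(), dim=1)   # (H*W, C) query features
    tgt = F.normalize(feat_tgt.reshape(c, -1).t(), dim=1)   # (H*W, C) key features
    labels = mask.reshape(-1).bool()                        # True = particle pixel

    # Evenly sample n_per_class query locations on particles and on the background.
    p_idx = torch.nonzero(labels, as_tuple=False).squeeze(1)
    b_idx = torch.nonzero(~labels, as_tuple=False).squeeze(1)
    q_idx = torch.cat([
        p_idx[torch.randperm(p_idx.numel())[:n_per_class]],
        b_idx[torch.randperm(b_idx.numel())[:n_per_class]],
    ])

    q, k, lab = src[q_idx], tgt[q_idx], labels[q_idx]       # (2B, C), (2B, C), (2B,)
    logits = q @ k.t() / tau                                 # (2B, 2B) similarities

    # Positive pair: the same spatial location (diagonal of `logits`).
    # Negatives: keys whose particle/background label is opposite to the query's;
    # same-label off-diagonal entries are masked out entirely.
    same_label = lab.unsqueeze(1).eq(lab.unsqueeze(0))
    eye = torch.eye(len(q_idx), dtype=torch.bool, device=logits.device)
    logits = logits.masked_fill(same_label & ~eye, float("-inf"))
    targets = torch.arange(len(q_idx), device=logits.device)
    return F.cross_entropy(logits, targets)
```

In the quoted setup, a term of this kind would be combined with the adversarial objective using the stated weight λ = 10.0.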