DRACO: A Denoising-Reconstruction Autoencoder for Cryo-EM

Authors: YingJun Shen, Haizhao Dai, Qihe Chen, Yan Zeng, Jiakai Zhang, Yuan Pei, Jingyi Yu

NeurIPS 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | DRACO demonstrates the best performance in denoising, micrograph curation, and particle picking tasks compared to state-of-the-art baselines.
Researcher Affiliation | Collaboration | (1) School of Information Science and Technology, ShanghaiTech University. (2) Cellverse Co., Ltd. (3) iHuman Institute, ShanghaiTech University.
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | No | All code, pre-trained model weights, and datasets will be made publicly available for further research and model development. (Also confirmed by NeurIPS checklist: 'We do not provide code and data in submission. However, the code and data will be released upon acceptance.')
Open Datasets | Yes | Direct access to the public database leads to varying data quality, inconsistent data formats, or missing annotations. Therefore, we construct a large-scale, high-quality, and diverse single-particle cryo-EM image dataset by curating and manually processing 529 sets of data from EMPIAR [17], obtaining over 270,000 cryo-EM movies or micrographs in total.
Dataset Splits | Yes | We divided these micrographs into training and evaluation datasets using an 80%/20% split ratio.
Hardware Specification | Yes | The warm-up stage takes 6 hours, and the pre-training stage takes 16 hours on a GPU cluster with 64 NVIDIA A800 GPUs, requiring approximately 80 GB of memory for a batch size of 4096.
Software Dependencies | No | The paper mentions software like cryoSPARC and Detectron2, but does not provide specific version numbers for these or other software dependencies.
Experiment Setup | Yes | The decoder of DRACO uses 8 Transformer blocks with embedding dimension 512, followed by a three-layer convolution neck and a linear projection layer with an output dimension 16 × 16, which is also the patch size of the input. The mask ratio for the one input micrograph is 0.75 by default. ... We warm up DRACO ... for 200 epochs. Then we adopt our novel denoising-reconstruction pre-training for 400 epochs.
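
The figures quoted in the Dataset Splits, Hardware Specification, and Experiment Setup rows can be made concrete with a small sketch. This is not the authors' code, which had not been released at the time of this summary. It is a minimal illustration that assumes a seeded random shuffle for the 80%/20% split, uniform random patch masking at ratio 0.75, an even sharding of the batch across GPUs, and a hypothetical 224 × 224 input size. None of these four details is stated in the rows above. The function names `split_dataset` and `random_mask` are likewise invented for illustration.

```python
import random

# Values taken from the summary rows above.
PATCH_SIZE = 16        # patch size is 16 x 16
MASK_RATIO = 0.75      # default mask ratio
TRAIN_FRACTION = 0.8   # 80%/20% train/evaluation split
NUM_GPUS = 64          # NVIDIA A800 GPUs in the cluster
GLOBAL_BATCH = 4096    # reported batch size


def split_dataset(items, train_fraction=TRAIN_FRACTION, seed=0):
    """Shuffle a copy of items and split it into (train, eval) lists.

    Assumption: a seeded random shuffle; the summary does not say
    how the split was drawn.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def random_mask(image_size, patch_size=PATCH_SIZE,
                mask_ratio=MASK_RATIO, seed=0):
    """Return (masked, visible) sorted patch indices for a square image.

    Assumption: patches are masked uniformly at random.
    """
    per_side = image_size // patch_size
    num_patches = per_side * per_side
    num_masked = int(num_patches * mask_ratio)
    order = list(range(num_patches))
    random.Random(seed).shuffle(order)
    return sorted(order[:num_masked]), sorted(order[num_masked:])


if __name__ == "__main__":
    train, evaluation = split_dataset(range(1000))
    print("train/eval sizes:", len(train), len(evaluation))

    # 224 is a hypothetical input size, not a value from the paper.
    masked, visible = random_mask(image_size=224)
    print("masked/visible patches:", len(masked), len(visible))

    # Per-GPU batch size if the global batch is sharded evenly (assumption).
    print("per-GPU batch:", GLOBAL_BATCH // NUM_GPUS)

    # Output dimension of the linear projection, one value per patch pixel.
    print("projection outputs per patch:", PATCH_SIZE * PATCH_SIZE)
```

Under these assumptions, a 224 × 224 input yields 196 patches, of which 147 are masked and 49 stay visible, and an even split of the 4096-sample batch across 64 GPUs gives 64 samples per GPU.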