Learning Over Molecular Conformer Ensembles: Datasets and Benchmarks
Authors: Yanqiao Zhu, Jeehyun Hwang, Keir Adams, Zhen Liu, Bozhao Nan, Brock Stenfors, Yuanqi Du, Jatin Chauhan, Olaf Wiest, Olexandr Isayev, Connor W. Coley, Yizhou Sun, Wei Wang
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In addition, we conduct a comprehensive empirical study, which benchmarks representative 1D, 2D, and 3D MRL models, along with two strategies that explicitly incorporate conformer ensembles into 3D models. Our findings reveal that direct learning from an accessible conformer space can improve performance on a variety of tasks and models. Our experimental results confirm the potential effectiveness of incorporating conformer ensembles in MRL, highlighting the improvements over conventional single-conformation 3D networks. |
| Researcher Affiliation | Academia | UCLA, MIT, CMU, Notre Dame, Cornell |
| Pseudocode | No | The paper does not contain any explicitly labeled "Pseudocode" or "Algorithm" blocks. |
| Open Source Code | Yes | Project homepage: https://github.com/SXKDZ/MARCEL ... Detailed information regarding dataset access, data formatting, and loading procedures can be found at our GitHub repository https://github.com/SXKDZ/MARCEL. |
| Open Datasets | Yes | Detailed information regarding dataset access, data formatting, and loading procedures can be found at our GitHub repository https://github.com/SXKDZ/MARCEL. Our Drugs-75K can be accessed at https://github.com/SXKDZ/MARCEL/tree/main/datasets/Drugs. As for the conformer ensembles and descriptors that we generated, they are licensed under the Apache License. |
| Dataset Splits | Yes | Each dataset is partitioned randomly into three subsets: 70% for training, 10% for validation, and 20% for test. (A minimal sketch of this split follows the table.) |
| Hardware Specification | Yes | Most of the experiments are conducted on servers equipped with Nvidia A100 GPUs, each with 40GB of memory. For memory-intensive models such as GemNet and LEFTNet, we use servers with Nvidia H100 GPUs, each with 80GB memory. |
| Software Dependencies | No | The paper mentions using "PyTorch [60] and PyTorch-Geometric [61] to implement all deep learning models" but does not specify version numbers for these software components. |
| Experiment Setup | Yes | Each model is trained over 2,000 epochs using the Adam optimizer [55] with early stopping triggered if there is no improvement on the training loss over 200 epochs. ... To ensure a fair comparison, the hidden dimension size is uniformly set to 128 for all models. (A minimal sketch of this training setup follows the table.) |
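The Dataset Splits row describes a simple 70%/10%/20% random partition. Below is a minimal sketch of that split, assuming a PyTorch-style dataset object; the `split_dataset` helper and the fixed `seed` are illustrative and not taken from the paper.

```python
# Minimal sketch of the 70/10/20 random split described in the paper.
# `split_dataset` and the fixed seed are hypothetical, not the authors' code.
import torch
from torch.utils.data import random_split

def split_dataset(dataset, seed=0):
    """Partition a dataset into 70% train, 10% validation, 20% test."""
    n = len(dataset)
    n_train = int(0.7 * n)
    n_val = int(0.1 * n)
    n_test = n - n_train - n_val  # remainder absorbs rounding, keeping full coverage
    generator = torch.Generator().manual_seed(seed)  # reproducible shuffling
    return random_split(dataset, [n_train, n_val, n_test], generator=generator)
```

Assigning the remainder to the test set guarantees the three subsets cover every example even when 70% and 10% of the dataset size do not divide evenly.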
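The Experiment Setup row likewise maps onto a standard training loop. The sketch below assumes PyTorch Geometric-style batches with a `y` target attribute, a caller-supplied `loss_fn`, and a default Adam learning rate, none of which appear in the quoted excerpt; only the 2,000-epoch budget, the 200-epoch early-stopping patience on the training loss, and the Adam optimizer come from the paper.

```python
# Sketch of the quoted setup: Adam, up to 2,000 epochs, early stopping after
# 200 epochs without training-loss improvement. The learning rate and batch
# layout (`batch.y`) are assumptions, not from the paper.
import torch

def train(model, loader, loss_fn, max_epochs=2000, patience=200, lr=1e-3):
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    best_loss, stale_epochs = float("inf"), 0
    for epoch in range(max_epochs):
        model.train()
        epoch_loss = 0.0
        for batch in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(batch), batch.y)  # assumed batch layout
            loss.backward()
            optimizer.step()
            epoch_loss += loss.item()
        if epoch_loss < best_loss:  # early stopping tracks the training loss
            best_loss, stale_epochs = epoch_loss, 0
        else:
            stale_epochs += 1
            if stale_epochs >= patience:
                break
    return model
```

The uniform hidden dimension of 128 noted in the same row would be set in each model's constructor rather than in this loop.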