Swallowing the Bitter Pill: Simplified Scalable Conformer Generation
Authors: Yuyang Wang, Ahmed A. A. Elhag, Navdeep Jaitly, Joshua M. Susskind, Miguel Ángel Bautista
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental results show that scaling up the model capacity leads to large gains in generalization performance without enforcing inductive biases like rotational equivariance. MCF represents an advance in extending diffusion models to handle complex scientific problems in a conceptually simple, scalable and effective manner. (Abstract) and Experiments on recent conformer generation benchmarks show MCF surpasses strong baselines by a gap that gets larger as we scale model capacity (Introduction) |
| Researcher Affiliation | Industry | Yuyang Wang¹, Ahmed A. Elhag¹², Navdeep Jaitly¹, Joshua M. Susskind¹, Miguel Ángel Bautista¹ (¹Apple; ²work was completed while A.A.E. was an intern with Apple). Correspondence to: {yuyang_wang4, aa_elhag, njaitly, jsusskind, mbautistamartin}@apple.com. |
| Pseudocode | Yes | Algorithm 1 Training and Algorithm 2 Sampling (on page 4). |
| Open Source Code | No | The paper does not contain an explicit statement that the authors are releasing the code for their work described in this paper, nor a direct link to their source code repository. |
| Open Datasets | Yes | We use two popular datasets: GEOM-QM9 and GEOM-DRUGS (Axelrod & Gomez-Bombarelli, 2022). and Axelrod, S. and Gomez-Bombarelli, R. GEOM, energy-annotated molecular conformations for property prediction and molecular generation. Scientific Data, 9(1):185, 2022. (References section). |
| Dataset Splits | Yes | In our experiments, we split GEOM-QM9 and GEOM-DRUGS randomly based on molecules into train/validation/test (80%/10%/10%). At the end, for each dataset, we report the performance on 1000 test molecules. Thus, the splits contain 106586/13323/1000 and 243473/30433/1000 molecules for GEOM-QM9 and GEOM-DRUGS, respectively. (Appendix A.2) A minimal split sketch follows the table. |
| Hardware Specification | Yes | For GEOM-QM9, we train models using a machine with 4 Nvidia A100 GPUs using precision BF16. For GEOM-DRUGS, we train models using precision FP32, where MCF-B is trained with 8 Nvidia A100 GPUs and MCF-L is trained with 16 Nvidia A100 GPUs. (Appendix A.2.3) A BF16 training-step sketch follows the table. |
| Software Dependencies | No | The paper mentions software components and tools like Perceiver IO and xTB but does not provide specific version numbers for these or other key software libraries like Python, PyTorch, or CUDA, which are necessary for full reproducibility. |
| Experiment Setup | Yes | An AdamW (Loshchilov & Hutter, 2017) optimizer is employed during training with a learning rate of 1e-4. Cosine learning rate decay is deployed with 30K warmup steps. We use EMA with a decay of 0.999. Models are trained for 300K steps on GEOM-QM9 and 750K steps on GEOM-DRUGS. All models use an effective batch size of 512. (Appendix A.2.1) and Table 4. Hyperparameters and settings for MCF on different datasets. (Table 4 caption). An optimizer/EMA sketch follows the table. |
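
The dataset-splits row reports a random molecule-level 80%/10%/10% split, with 1000 test molecules evaluated per dataset. The snippet below is a minimal sketch of such a split; `split_molecules`, the seed, and the ID list are illustrative assumptions, not the authors' code.

```python
import random

def split_molecules(molecule_ids, seed=0):
    """Randomly split molecule IDs into train/val/test (80%/10%/10%)."""
    ids = list(molecule_ids)
    random.Random(seed).shuffle(ids)
    n = len(ids)
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train = ids[:n_train]
    val = ids[n_train:n_train + n_val]
    test = ids[n_train + n_val:]
    # The paper reports results on 1000 test molecules per dataset.
    return train, val, test[:1000]
```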
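The hardware row notes BF16 precision for GEOM-QM9 training (FP32 for GEOM-DRUGS). Below is a hedged sketch of a BF16 training step using `torch.autocast`; the `model(batch)` loss interface is an assumption and not the authors' implementation.

```python
import torch

def train_step(model, batch, optimizer, use_bf16=True):
    """One training step with optional BF16 autocast (as reported for GEOM-QM9)."""
    optimizer.zero_grad(set_to_none=True)
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16, enabled=use_bf16):
        loss = model(batch)  # placeholder: assumes the model returns its training loss
    loss.backward()
    optimizer.step()
    return loss.item()
```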
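The experiment-setup row lists AdamW with learning rate 1e-4, cosine learning rate decay with 30K warmup steps, EMA with decay 0.999, and 300K/750K training steps. The sketch below assembles that configuration in PyTorch under stated assumptions; the linear warmup shape and the `EMA` helper class are illustrative choices, not the authors' code.

```python
import math
import torch

def build_optimizer_and_schedule(model, total_steps=300_000, warmup_steps=30_000):
    """AdamW (lr 1e-4) with linear warmup followed by cosine decay to zero."""
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    def lr_lambda(step):
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    return optimizer, torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

class EMA:
    """Exponential moving average of model parameters (decay 0.999)."""
    def __init__(self, model, decay=0.999):
        self.decay = decay
        self.shadow = {k: v.detach().clone() for k, v in model.state_dict().items()}

    @torch.no_grad()
    def update(self, model):
        for k, v in model.state_dict().items():
            if v.dtype.is_floating_point:
                self.shadow[k].mul_(self.decay).add_(v.detach(), alpha=1 - self.decay)
            else:
                self.shadow[k].copy_(v)
```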