Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Scalable Non-Equivariant 3D Molecule Generation via Rotational Alignment
Authors: Yuhui Ding, Thomas Hofmann
ICML 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | In this section, we present experimental results that empirically validate the effectiveness of our proposed approach. In Section 4.1, we introduce the experimental setup, including the datasets, baselines and implementation details. In Section 4.2 and 4.3, we present the main results on molecule generation benchmarks. In Section 4.4, we show the results of ablation studies. In Section 4.5, we demonstrate the efficiency and scalability of our non-equivariant model. Finally in Section 4.6, we show the conditional generation results. |
| Researcher Affiliation | Academia | Department of Computer Science, ETH Zurich. Correspondence to: Yuhui Ding <EMAIL>. |
| Pseudocode | Yes | Algorithm 1 Training algorithm for the autoencoder. Inputs: atomic coordinates x, atom features h. Learnable parameters: rotation network Rθ, encoder Eη, decoder Dψ. while not converged do: R ← Rθ(x, h); µx, µh ← Eη(Rx, h); subtract center of gravity from µx; ϵ = (ϵx, ϵh) ∼ N(0, I); subtract center of gravity from ϵx; zx, zh ← µ + σϵ; calculate the reconstruction loss L(θ, η, ψ) (Eq. 20); update θ, η, ψ; end while |
| Open Source Code | Yes | Our code is available at https://github.com/skeletondyh/RADM. |
| Open Datasets | Yes | Datasets We first evaluate our approach using the QM9 dataset (Ramakrishnan et al., 2014) which is a standard molecule generation benchmark widely used by related works. ... Next we evaluate our model on the larger GEOM-Drugs dataset (Axelrod & Gomez-Bombarelli, 2022). |
| Dataset Splits | Yes | We split the dataset in the same way as Hoogeboom et al. (2022), with 100K, 18K, 13K samples for the train, validation and test partitions respectively. |
| Hardware Specification | Yes | All the numbers are measured on a single RTX 4090 GPU. |
| Software Dependencies | No | No specific version numbers for software dependencies are provided in the main text. While PyTorch is mentioned, its version is not specified. |
| Experiment Setup | Yes | On QM9, we train the autoencoder for 200 epochs using a batch size of 64. ... We adopt a batch size of 256 as used in the DiT paper, and train both RADM-DiT-S and RADM-DiT-B for around 5500 epochs. ... We train the autoencoder (and the rotation network) using the Adam optimizer with a learning rate of 1×10⁻⁴ and a cosine annealing schedule. The latent diffusion model is also trained using Adam with a learning rate of 1×10⁻⁴. ... We use the same hidden dimension and number of layers for the autoencoder as GeoLDM, and the number of layers of the rotation network is 2 on both datasets. As for the diffusion model, we use the same noise schedule and number of time steps as EDM/GeoLDM. |
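The Algorithm 1 cell and the reported optimizer settings (Adam, learning rate 1×10⁻⁴, cosine annealing) can be combined into a runnable sketch. This is a minimal toy illustration, not the paper's implementation: the `RotationNet`, `Encoder`, and `Decoder` modules here are made-up stand-ins for Rθ, Eη, and Dψ, and the squared-error loss is a placeholder for the paper's reconstruction loss (Eq. 20).

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def center(x):
    """Subtract the center of gravity (mean over atoms), as in Algorithm 1."""
    return x - x.mean(dim=0, keepdim=True)

class RotationNet(nn.Module):
    """Toy stand-in for the rotation network R_theta: orthogonalizes a
    learned 3x3 matrix via QR so the output is a rotation."""
    def __init__(self):
        super().__init__()
        self.m = nn.Parameter(torch.eye(3) + 0.01 * torch.randn(3, 3))
    def forward(self, x, h):
        q, r = torch.linalg.qr(self.m)
        # Fix column signs so the decomposition is unique.
        return q * torch.sign(torch.diagonal(r))

class Encoder(nn.Module):
    """Toy encoder E_eta: per-atom linear maps to latent means."""
    def __init__(self, h_dim, z_dim):
        super().__init__()
        self.fx = nn.Linear(3 + h_dim, 3)      # latent coordinate mean mu_x
        self.fh = nn.Linear(3 + h_dim, z_dim)  # latent feature mean mu_h
    def forward(self, x, h):
        inp = torch.cat([x, h], dim=-1)
        return self.fx(inp), self.fh(inp)

class Decoder(nn.Module):
    """Toy decoder D_psi: reconstructs coordinates and features."""
    def __init__(self, h_dim, z_dim):
        super().__init__()
        self.g = nn.Linear(3 + z_dim, 3 + h_dim)
    def forward(self, zx, zh):
        out = self.g(torch.cat([zx, zh], dim=-1))
        return out[:, :3], out[:, 3:]

def train_autoencoder(x, h, steps=50, sigma=0.1, z_dim=4):
    rot = RotationNet()
    enc, dec = Encoder(h.shape[1], z_dim), Decoder(h.shape[1], z_dim)
    params = list(rot.parameters()) + list(enc.parameters()) + list(dec.parameters())
    # Adam with lr 1e-4 and a cosine annealing schedule, as reported.
    opt = torch.optim.Adam(params, lr=1e-4)
    sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=steps)
    for _ in range(steps):
        R = rot(x, h)                       # R <- R_theta(x, h)
        mu_x, mu_h = enc(x @ R.T, h)        # mu <- E_eta(Rx, h)
        mu_x = center(mu_x)                 # subtract center of gravity
        eps_x = center(torch.randn_like(mu_x))  # CoG-free coordinate noise
        eps_h = torch.randn_like(mu_h)
        zx, zh = mu_x + sigma * eps_x, mu_h + sigma * eps_h  # z <- mu + sigma*eps
        x_rec, h_rec = dec(zx, zh)
        # Placeholder reconstruction loss (the paper's Eq. 20 differs).
        loss = ((x_rec - x @ R.T) ** 2).mean() + ((h_rec - h) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step(); sched.step()
    return rot, loss.item()
```

Note the two centering steps: both the latent coordinate mean and the coordinate noise are projected to the zero center-of-gravity subspace, which keeps the latent coordinates translation-free.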