Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Scalable Non-Equivariant 3D Molecule Generation via Rotational Alignment

Authors: Yuhui Ding, Thomas Hofmann

ICML 2025 | Venue PDF | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | In this section, we present experimental results that empirically validate the effectiveness of our proposed approach. In Section 4.1, we introduce the experimental setup, including the datasets, baselines and implementation details. In Section 4.2 and 4.3, we present the main results on molecule generation benchmarks. In Section 4.4, we show the results of ablation studies. In Section 4.5, we demonstrate the efficiency and scalability of our non-equivariant model. Finally in Section 4.6, we show the conditional generation results.
Researcher Affiliation | Academia | Department of Computer Science, ETH Zurich. Correspondence to: Yuhui Ding <EMAIL>.
Pseudocode | Yes |
Algorithm 1: Training algorithm for the autoencoder
  Inputs: atomic coordinates x, atom features h
  Learnable parameters: rotation network Rθ, encoder Eη, decoder Dψ
  while not converged do
    R ← Rθ(x, h)
    µx, µh ← Eη(Rx, h)
    Subtract center of gravity from µx
    ϵ = (ϵx, ϵh) ∼ N(0, I)
    Subtract center of gravity from ϵx
    zx, zh ← µ + σϵ
    Calculate the reconstruction loss L(θ, η, ψ) (Eq. 20)
    Update θ, η, ψ
  end while
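The two "subtract center of gravity" steps in Algorithm 1 keep both the latent mean and the injected noise in the zero-center-of-mass subspace, so the reparameterized latent coordinates stay there too. A minimal NumPy toy of that reparameterization step (array shapes, σ, and the helper name are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

def subtract_center_of_gravity(x):
    # Project per-atom coordinates onto the zero-center-of-mass subspace.
    return x - x.mean(axis=0, keepdims=True)

rng = np.random.default_rng(0)

# Toy "encoder output" and noise for 5 atoms in 3D (hypothetical values).
mu_x = subtract_center_of_gravity(rng.normal(size=(5, 3)))
eps_x = subtract_center_of_gravity(rng.normal(size=(5, 3)))

sigma = 0.1  # assumed scalar noise scale for illustration
z_x = mu_x + sigma * eps_x

# Because both terms are zero-CoM, the sampled latent is zero-CoM as well.
assert np.allclose(z_x.mean(axis=0), 0.0)
```

Since the projection is linear, any linear combination of zero-CoM arrays remains zero-CoM, which is why projecting µx and ϵx separately suffices.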
Open Source Code | Yes | Our code is available at https://github.com/skeletondyh/RADM.
Open Datasets | Yes | Datasets: We first evaluate our approach using the QM9 dataset (Ramakrishnan et al., 2014), which is a standard molecule generation benchmark widely used by related works. ... Next we evaluate our model on the larger GEOM-Drugs dataset (Axelrod & Gomez-Bombarelli, 2022).
Dataset Splits | Yes | We split the dataset in the same way as Hoogeboom et al. (2022), with 100K, 18K, 13K samples for the train, validation and test partitions respectively.
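The reported 100K/18K/13K partition can be sketched as a deterministic index split. Note the actual split follows the fixed molecule indices of Hoogeboom et al. (2022); the seed and shuffle below are placeholder assumptions for illustration only:

```python
import numpy as np

# Reported QM9 split sizes (train / validation / test).
n_train, n_val, n_test = 100_000, 18_000, 13_000

# Hypothetical deterministic shuffle; the real split uses the index
# files from Hoogeboom et al. (2022), not this seed.
perm = np.random.default_rng(0).permutation(n_train + n_val + n_test)
train_idx, val_idx, test_idx = np.split(perm, [n_train, n_train + n_val])
```

A fixed seed (or, better, the published index files) is what makes such a split reproducible across runs.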
Hardware Specification | Yes | All the numbers are measured on a single RTX 4090 GPU.
Software Dependencies | No | No specific version numbers for software dependencies are provided in the main text. While PyTorch is mentioned, its version is not specified.
Experiment Setup | Yes | On QM9, we train the autoencoder for 200 epochs using a batch size of 64. ... We adopt a batch size of 256 as used in the DiT paper, and train both RADM-DiT-S and RADM-DiT-B for around 5500 epochs. ... We train the autoencoder (and the rotation network) using the Adam optimizer with a learning rate of 1×10⁻⁴ and a cosine annealing schedule. The latent diffusion model is also trained using Adam with a learning rate of 1×10⁻⁴. ... We use the same hidden dimension and number of layers for the autoencoder as GeoLDM, and the number of layers of the rotation network is 2 on both datasets. As for the diffusion model, we use the same noise schedule and number of time steps as EDM/GeoLDM.
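The optimizer setup above combines a fixed peak learning rate of 1×10⁻⁴ with cosine annealing. A stdlib sketch of the standard cosine schedule (the paper states only "cosine annealing"; lr_min = 0 and this exact functional form are assumptions):

```python
import math

def cosine_annealing_lr(step, total_steps, lr_max=1e-4, lr_min=0.0):
    # Standard cosine annealing: starts at lr_max, decays to lr_min
    # over total_steps following half a cosine period.
    return lr_min + 0.5 * (lr_max - lr_min) * (
        1.0 + math.cos(math.pi * step / total_steps)
    )

# One value per epoch for a 200-epoch run (matching the QM9 autoencoder
# training length reported above; the per-epoch granularity is assumed).
lrs = [cosine_annealing_lr(t, total_steps=200) for t in range(201)]
```

In PyTorch this corresponds to pairing `torch.optim.Adam` with `torch.optim.lr_scheduler.CosineAnnealingLR`, though whether the paper steps the schedule per epoch or per iteration is not stated.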