Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
E(3)-equivariant models cannot learn chirality: Field-based molecular generation
Authors: Alexandru Dumitrescu, Dani Korpela, Markus Heinonen, Yogesh Verma, Valerii Iakovlev, Vikas Garg, Harri Lähdesmäki
ICLR 2025
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed model captures all molecular geometries including chirality, while still achieving highly competitive performance with E(3)-based methods across standard benchmarking metrics. Code is available at https://dumitrescu-alexandru.github.io/FMG-web/. We report the basic molecule properties and molecule graph quality metrics in Tables 1 and 2, while molecule conformation and conditional generation results can be found in Appendix A.1 and A.2. In Table 2, we see the same effect mentioned in the QM9 experiments, where the method has a worse TVa metric, likely caused by our method favoring other metrics to correct atom counts. |
| Researcher Affiliation | Collaboration | Department of Computer Science, Aalto University; YaiYai Ltd. Correspondence: {alexandru.dumitrescu}@aalto.fi |
| Pseudocode | Yes | Algorithm 1 Peak extraction. Input: field u ∈ R^(Nx×Ny×Nz), threshold t. q ← {x : u(x) > t}; A_m ← {}; repeat: p ← q.pop(); neigh ← get_neigh(p, t, u); A_m.insert(mean(neigh ∪ {p})); q.remove(neigh); until q is empty; return A_m |
| Open Source Code | Yes | Code is available at https://dumitrescu-alexandru.github.io/FMG-web/. |
| Open Datasets | Yes | QM9 (Ramakrishnan et al., 2014) is a small molecule dataset, containing 134k molecules... The Geometric Ensemble Of Molecules (GEOM) dataset (Axelrod & Gómez-Bombarelli, 2022) contains molecules of up to 181 atoms and 37 million conformations along with their corresponding energies. |
| Dataset Splits | Yes | We use the splits from (Hoogeboom et al., 2022), with 100k, 18k, and 10k molecules for training, validation, and testing respectively. |
| Hardware Specification | Yes | For the QM9 experiments, we use four A100 GPUs (40GB memory version) for 180 hours, and 1.2 million iterations (or about 780 epochs). On the GEOM-Drugs dataset, we used the same four A100 GPUs for 330 hours and trained our explicit H GEOM-Drugs model for 1.32 million iterations (6 epochs), and our implicit one for 1.2 million iterations (5.5 epochs). |
| Software Dependencies | No | The paper mentions 'Adam optimizer' for optimization and 'U-Net architecture' for the denoiser, but it does not specify any software libraries or frameworks with version numbers (e.g., PyTorch 1.9, Python 3.8, TensorFlow 2.x). |
| Experiment Setup | Yes | We optimize the L_simple objective using the Adam optimizer with an 8×10⁻⁵ learning rate and (0.9, 0.99) for Adam's β parameters. The models are trained using batch sizes of 32 and 64 for the GEOM-Drugs and QM9 datasets, respectively. |
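The peak-extraction pseudocode quoted above (Algorithm 1) groups above-threshold voxels of the generated field into connected regions and returns each region's mean coordinate as a candidate atom position. A minimal NumPy sketch, assuming 26-connected neighbourhoods (the paper's `get_neigh` is not specified in the excerpt, so the flood-fill details here are an illustration, not the authors' exact implementation):

```python
import numpy as np
from collections import deque

def extract_peaks(u, t):
    """Sketch of Algorithm 1 (peak extraction): cluster voxels of the
    field u (shape (Nx, Ny, Nz)) with density above threshold t into
    connected peaks, returning each peak's mean coordinate."""
    above = set(map(tuple, np.argwhere(u > t)))  # q = {x : u(x) > t}
    peaks = []                                   # A_m
    while above:                                 # repeat ... until q is empty
        seed = above.pop()                       # p = q.pop()
        component, frontier = [seed], deque([seed])
        while frontier:                          # flood-fill: get_neigh(p, t, u)
            p = frontier.popleft()
            for d in np.ndindex(3, 3, 3):        # 26-connected offsets (-1..1)^3
                n = tuple(np.array(p) + np.array(d) - 1)
                if n in above:
                    above.remove(n)              # q.remove(neigh)
                    component.append(n)
                    frontier.append(n)
        peaks.append(np.mean(component, axis=0)) # A_m.insert(mean(neigh ∪ {p}))
    return np.array(peaks)
```

Each returned coordinate is in voxel units; mapping back to Angstrom positions would depend on the field's grid resolution, which the excerpt does not specify.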
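The experiment-setup row reports only optimizer hyperparameters (Adam, learning rate 8×10⁻⁵, β = (0.9, 0.99)); since no framework is named, a framework-free sketch of a single Adam update with those values makes the setup concrete. The function and its arguments are illustrative placeholders, not the authors' code:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=8e-5, b1=0.9, b2=0.99, eps=1e-8):
    """One Adam update using the hyperparameters reported in the paper
    (lr = 8e-5, betas = (0.9, 0.99)); eps is a standard default, assumed."""
    m = b1 * m + (1 - b1) * grad          # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad ** 2     # second-moment (variance) estimate
    m_hat = m / (1 - b1 ** t)             # bias correction, step t >= 1
    v_hat = v / (1 - b2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

On the first step with unit gradients, the bias-corrected update moves each parameter by roughly the learning rate, which is why the reported 8×10⁻⁵ directly bounds the initial step size.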