Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).

Diffusion Generative Modeling on Lie Group Representations

Authors: Marco Bertolini, Tuan Anh Le, Djork-Arné Clevert

NeurIPS 2025

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	We validate our approach through experiments on diverse data types, demonstrating its effectiveness in real-world applications such as SO(3)-guided molecular conformer generation and modeling ligand-specific global SE(3) transformations for molecular docking, showing improvement in comparison to Riemannian diffusion on the group itself.
Researcher Affiliation	Industry	Marco Bertolini, Tuan Le & Djork-Arné Clevert, Machine Learning Research, Pfizer Worldwide Research and Development, Friedrichstraße 110, 10117 Berlin, Germany EMAIL
Pseudocode	Yes	We describe training and sampling procedures in Algorithms 1 and 2 in Appendix E.
Open Source Code	Yes	Code Availability: Our source code will be made available on https://github.com/pfizer-opensource/symmetry-inducedscore-matching.
Open Datasets	Yes	QM9 dataset (Ramakrishnan et al., 2014). We only keep the lowest energy conformer as provided in the original dataset
Dataset Splits	Yes	The trained classifier achieves greater than 99% accuracy on the MNIST test set, providing a reliable metric for evaluating reconstruction quality. ... The model is trained using Adam optimizer (lr=0.001), cross-entropy loss, batch size 64, for 10 epochs on the standard MNIST training set (60,000 samples).
Hardware Specification	No	No specific hardware details (GPU/CPU models, processor types, memory amounts, or cloud instance types) are provided in the paper.
Software Dependencies	No	The paper mentions 'RDKit' and 'Python' but does not specify their version numbers or any other software dependencies with version numbers.
Experiment Setup	Yes	We trained the model with T = 100 time-steps, but for sampling it suffices to set T = 10. ... We use L = 5 message passing layers with sdim = 128, vdim = 64 scalar and vector features, respectively. ... We use the cosine scheduler proposed by Dhariwal & Nichol (2021) and T = 100 diffusion timesteps.
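
The Experiment Setup row mentions a cosine scheduler with T = 100 diffusion timesteps. As a rough illustration of what such a schedule looks like, the sketch below computes per-step noise levels in the commonly used cosine form. The offset `s = 0.008`, the clipping value `max_beta = 0.999`, and the function names are assumptions chosen for illustration; they are not taken from the paper or its code release, which may use a different variant.

```python
import math
from typing import List


def cosine_alpha_bar(t: int, T: int, s: float = 0.008) -> float:
    """Cumulative signal level at step t, normalised so that step 0 gives 1.

    The small offset s (assumed value) keeps the first noise levels from
    being vanishingly small.
    """
    def f(u: int) -> float:
        return math.cos((u / T + s) / (1 + s) * math.pi / 2) ** 2

    return f(t) / f(0)


def cosine_betas(T: int = 100, s: float = 0.008, max_beta: float = 0.999) -> List[float]:
    """Per-step noise variances beta_1..beta_T derived from the cumulative schedule.

    Each beta_t is 1 - alpha_bar(t) / alpha_bar(t - 1), clipped at max_beta
    (assumed value) to avoid a degenerate final step.
    """
    betas = []
    for t in range(1, T + 1):
        beta = 1.0 - cosine_alpha_bar(t, T, s) / cosine_alpha_bar(t - 1, T, s)
        betas.append(min(beta, max_beta))
    return betas


# T = 100 matches the training setting quoted in the table.
betas = cosine_betas(T=100)
```

With this form, the noise added per step is small at first and grows toward the end of the chain, where the clipping takes effect.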