Reproducibility Index

Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," Under Review, 2026).

Trivialized Momentum Facilitates Diffusion Generative Modeling on Lie Groups

Authors: Yuchen Zhu, Tianrong Chen, Lingkai Kong, Evangelos Theodorou, Molei Tao

ICLR 2025 | Venue PDF | LLM Run Details

Reproducibility Variable	Result	LLM Response
Research Type	Experimental	3 EXPERIMENTAL RESULTS. We will demonstrate accurate generative modeling of Lie group data corresponding to 1) complicated and/or high-dim distribution on torus, 2) protein and RNA structures, 3) sophisticated synthetic datasets on possibly high-dim Special Orthogonal Group, and 4) an ensemble of quantum systems... The resulting method achieves state-of-the-art performance on protein and RNA torsion angle generation and sophisticated torus datasets. We also, arguably for the first time, tackle the generation of data on high-dimensional Special Orthogonal and Unitary groups, the latter essential for quantum problems. ... We outperform baselines by a large margin on protein/RNA torsion angle datasets.
Researcher Affiliation	Academia	Yuchen Zhu, Tianrong Chen, Lingkai Kong, Evangelos A. Theodorou, Molei Tao Georgia Institute of Technology EMAIL
Pseudocode	Yes	Algorithm 1 TDM (Trivialized Diffusion Model)... Algorithm 2 Forward Operator Splitting Integration (FOSI)... Algorithm 3 Backward Operator Splitting Integration (BSOI)... Algorithm 4 Probability Flow ODE
Open Source Code	Yes	Code is available at https://github.com/yuchen-zhu-zyc/TDM.
Open Datasets	Yes	Protein and RNA Torsion Angles: We access the dataset prepared by Huang et al. (Huang et al., 2022) from the repository of (Chen & Lipman, 2024).
Dataset Splits	Yes	All datasets were meticulously partitioned into training and testing sets using a 9:1 ratio.
Hardware Specification	Yes	Hardware: All the experiments are running on one RTX TITAN, one RTX 3090 and one 4090.
Software Dependencies	No	The paper mentions using 'AdamW optimizer' but does not specify software dependencies like programming languages or libraries with version numbers.
Experiment Setup	Yes	Throughout our experiments, we maintained the diffusion coefficient γ(t) constant at 1, while the total time horizon T varied depending on the task, with a good choice ranging from T = 5 to T = 15. We use AdamW optimizer to train the neural networks with an initial learning rate of 5 × 10⁻⁴ with a cosine annealing learning rate scheduler. We train for at most 200k iterations with a batch size of 1024 for each task, and we observe that the model usually converges within 100k iterations. For low dimensional experiments such as Torus, SO(3), we set D = 256. For other experiments, we set D = 512. We choose varied k based on problem difficulties, ranging from k = 3 to k = 5. We use SiLU as the activation function for all the MLPs used in the neural network.
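
The learning-rate schedule quoted in the Experiment Setup row can be illustrated with a minimal sketch. It assumes the standard cosine annealing formula, a schedule length equal to the 200k-iteration cap, and a floor of zero; the paper excerpt states none of these, and the function name `cosine_annealed_lr` is hypothetical rather than taken from the authors' code.

```python
import math

BASE_LR = 5e-4        # initial learning rate quoted in the Experiment Setup row
MAX_ITERS = 200_000   # assumption: schedule spans the "at most 200k iterations" cap
MIN_LR = 0.0          # assumption: the excerpt does not state a minimum learning rate


def cosine_annealed_lr(step, base_lr=BASE_LR, max_iters=MAX_ITERS, min_lr=MIN_LR):
    """Standard cosine annealing: decay from base_lr to min_lr over max_iters steps."""
    step = min(max(step, 0), max_iters)  # clamp so out-of-range steps stay well defined
    cosine = 0.5 * (1.0 + math.cos(math.pi * step / max_iters))
    return min_lr + (base_lr - min_lr) * cosine


if __name__ == "__main__":
    # Print the learning rate at a few points along the assumed schedule.
    for step in (0, 50_000, 100_000, 200_000):
        print(step, cosine_annealed_lr(step))
```

Under these assumptions the rate has fallen to half its initial value by 100k iterations, the point at which the paper reports that the model usually converges.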