Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].

Kinetic Langevin Diffusion for Crystalline Materials Generation

Authors: François R J Cornet, Federico Bergamin, Arghya Bhowmik, Juan Maria Garcia-Lastra, Jes Frellsen, Mikkel N. Schmidt

ICML 2025 | Venue PDF | LLM Run Details

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate KLDM on both Crystal Structure Prediction (CSP) and De-novo Generation (DNG) tasks, demonstrating its competitive performance with current state-of-the-art models. (...) 5. Experimental results (...) 5.4. Ablation study |
| Researcher Affiliation | Academia | 1 Technical University of Denmark, Kongens Lyngby, Denmark; 2 Pioneer Center for Artificial Intelligence, Copenhagen, Denmark. Correspondence to: François Cornet <EMAIL>, Federico Bergamin <EMAIL>. |
| Pseudocode | Yes | Appendix H. Algorithms: Algorithm 1, training_targets(f, v, l, a, t), a routine for sampling f_t, v_t, l_t, a_t from the transition kernels and the corresponding target scores; Algorithm 2, training algorithm; Algorithm 3, sampling algorithm; Algorithm 4, sampling with a single Predictor-Corrector (PC) step, similar to Rozet & Louppe (2023, Algorithm 4). |
| Open Source Code | Yes | We describe the exact experimental setting in Appendix I, and release a public code repository with our implementation of KLDM. |
| Open Datasets | Yes | We follow previous work (Jiao et al., 2023) and evaluate KLDM across 4 datasets: PEROV-5 (Castelli et al., 2012), perovskite materials with 5 atoms per unit cell (ABX3), all sharing the same structure but differing in composition; MP-20, nearly all experimentally stable materials from the Materials Project (Jain et al., 2013), with unit cells containing at most 20 atoms; and MPTS-52, also extracted from the Materials Project (Jain et al., 2013), with unit cells containing up to 52 atoms. |
| Dataset Splits | Yes | We trained all the networks using AdamW with the default PyTorch parameters, without gradient clipping and by performing early stopping based on metrics computed on a subset of the validation set: match-rate for the CSP task and valid structures for the DNG task. (...) For each variant of KLDM, we generated 10,000 samples, from which we discarded samples that contained elements not supported by the validation pipeline. |
| Hardware Specification | Yes | All experiments presented in this paper can be performed on a single GPU. We relied on a GPU cluster with a mix of RTX 3090 and RTX A5000 cards, each with 24 GB of memory. |
| Software Dependencies | No | We trained all the networks using AdamW with the default PyTorch parameters, without gradient clipping and by performing early stopping based on metrics computed on a subset of the validation set: match-rate for the CSP task and valid structures for the DNG task. |
| Experiment Setup | Yes | We kept the drift coefficient γ(t) constant at 1 in all the experiments presented in the paper, following Zhu et al. (2024). (...) we kept the time horizon constant at T = 2. (...) Regarding the network parameters, we considered 4 message-passing layers for PEROV-5, while we increased them to 6 for the remaining three datasets. In all experiments, we used a hidden dimension of 512, a 256-dimensional time embedding, and SiLU activation with layer norm. |
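The reported settings (γ(t) = 1, T = 2, 4 message-passing layers for PEROV-5 and 6 elsewhere, hidden dimension 512, 256-dimensional time embedding, AdamW without gradient clipping, early stopping on a validation metric) can be collected into a small sketch. This is a minimal, hypothetical summary written for illustration only: the class and field names (`KLDMConfig`, `EarlyStopping`, `config_for`) are not taken from the authors' released code.

```python
from dataclasses import dataclass

@dataclass
class KLDMConfig:
    """Hypothetical container for the hyperparameters reported above."""
    drift_gamma: float = 1.0    # drift coefficient γ(t), kept constant at 1
    time_horizon: float = 2.0   # diffusion time horizon T = 2
    num_mp_layers: int = 6      # 4 for PEROV-5, 6 for the other datasets
    hidden_dim: int = 512       # hidden dimension of the network
    time_embed_dim: int = 256   # dimension of the time-embedding vector
    activation: str = "SiLU"    # SiLU activation, used with layer norm
    optimizer: str = "AdamW"    # default PyTorch parameters, no grad clipping

def config_for(dataset: str) -> KLDMConfig:
    """Per-dataset layer count as described in the paper's setup."""
    layers = 4 if dataset == "PEROV-5" else 6
    return KLDMConfig(num_mp_layers=layers)

class EarlyStopping:
    """Minimal early-stopping tracker on a validation metric to maximize
    (e.g. match-rate for CSP, fraction of valid structures for DNG).
    The patience value here is an assumption; the paper does not state one."""
    def __init__(self, patience: int = 5):
        self.patience = patience
        self.best = float("-inf")
        self.bad_epochs = 0

    def step(self, metric: float) -> bool:
        """Record one validation metric; return True when training should stop."""
        if metric > self.best:
            self.best = metric
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In this sketch, the stopping criterion maximizes the validation metric, matching the paper's use of match-rate and structure validity; a loss-minimizing variant would flip the comparison.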