Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
LLM Meets Diffusion: A Hybrid Framework for Crystal Material Generation
Authors: Subhojyoti Khastagir, Kishalay Das, Pawan Goyal, Seung-Cheol Lee, Satadeep Bhattacharjee, Niloy Ganguly
NeurIPS 2025 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experiments on benchmark datasets demonstrate that CrysLLMGen consistently outperforms state-of-the-art models in both structural and compositional validity. It also generates more stable, unique, and novel crystal structures compared to existing approaches. CrysLLMGen shows strong generative capability under conditional prompts, effectively producing materials aligning with specified atomic compositions and space group constraints. |
| Researcher Affiliation | Academia | 1 Indian Institute of Technology, Kharagpur, India 2 Indo Korea Science and Technology Center, Bangalore, India Correspondence to Kishalay: EMAIL |
| Pseudocode | Yes | Algorithm 1 Sampling Process of CrysLLMGen |
| Open Source Code | Yes | Code is available at https://github.com/kdmsit/crysllmgen |
| Open Datasets | Yes | We use two popular material datasets for this task: Perov-5 [38, 39] and MP-20 [40]. While training all competitive models, we followed the standard dataset split of 60% for training, 20% for validation, and 20% for testing. |
| Dataset Splits | Yes | While training all competitive models, we followed the standard dataset split of 60% for training, 20% for validation, and 20% for testing. |
| Hardware Specification | No | Question: For each experiment, does the paper provide sufficient information on the computer resources (type of compute workers, memory, time of execution) needed to reproduce the experiments? Answer: [Yes] Justification: See section 5 and appendix C. Guidelines: The answer NA means that the paper does not include experiments. The paper should indicate the type of compute workers CPU or GPU, internal cluster, or cloud provider, including relevant memory and storage. The paper should provide the amount of compute required for each of the individual experimental runs as well as estimate the total compute. |
| Software Dependencies | No | We finetune the LLaMA-2 7B model for 1 epoch using the AdamW optimizer implemented via the transformers.Trainer interface. The learning rate is set to 0.0001. For the diffusion component, we use a batch size of 256 and adopt a cosine noise schedule. The model is trained for 1000 diffusion steps and inference is performed using 900 steps. The denoising network is implemented using a 6-layer CSPNet. Optimization is done using the Adam optimizer with a learning rate of 0.001. |
| Experiment Setup | Yes | LLM Component: We finetune the LLaMA-2 7B model for 1 epoch using the AdamW optimizer implemented via the transformers.Trainer interface. The learning rate is set to 0.0001. Diffusion Model: For the diffusion component, we use a batch size of 256 and adopt a cosine noise schedule. The model is trained for 1000 diffusion steps and inference is performed using 900 steps. The denoising network is implemented using a 6-layer CSPNet. Optimization is done using the Adam optimizer with a learning rate of 0.001. |
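
The setup and split values quoted in the table can be collected into a small configuration sketch for anyone attempting a rerun. This is an illustration only, not the authors' code: the class names, field names, `split_indices` helper, and seed are all hypothetical, and only the numeric values (learning rates, batch size, step counts, layer count, 60/20/20 proportions) come from the quoted text. The benchmark datasets may ship with predefined splits, in which case those should be used instead of a random shuffle.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class LLMConfig:
    # Values quoted in the "Experiment Setup" row; field names are hypothetical.
    base_model: str = "LLaMA-2 7B"
    epochs: int = 1
    optimizer: str = "AdamW"
    learning_rate: float = 1e-4


@dataclass(frozen=True)
class DiffusionConfig:
    # Values quoted in the "Experiment Setup" row; field names are hypothetical.
    batch_size: int = 256
    noise_schedule: str = "cosine"
    train_steps: int = 1000
    inference_steps: int = 900
    denoiser_layers: int = 6  # 6-layer CSPNet
    optimizer: str = "Adam"
    learning_rate: float = 1e-3


def split_indices(n, fractions=(0.6, 0.2, 0.2), seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    The random shuffle and the seed are assumptions made for illustration;
    the source only states the 60/20/20 proportions.
    """
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_train = int(fractions[0] * n)
    n_val = int(fractions[1] * n)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]  # remainder goes to the test set
    return train, val, test
```

Calling `split_indices(len(dataset))` returns three disjoint index lists that together cover every sample. Frozen dataclasses are used so that the recorded hyperparameters cannot be mutated accidentally during a run.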