Generative Hierarchical Materials Search

Authors: Sherry Yang, Simon Batzner, Ruiqi Gao, Muratahan Aykol, Alexander Gaunt, Brendan C McMorrow, Danilo Jimenez Rezende, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk

NeurIPS 2024

Reproducibility assessment. Each entry lists the variable, the result, and the supporting LLM response.
Research Type: Experimental
  Evidence: "Experiments show that GenMS outperforms other alternatives of directly using language models to generate structures, both in satisfying user requests and in generating low-energy structures. We confirm that GenMS is able to generate common crystal structures such as double perovskites or spinels solely from natural language input, and hence can form the foundation for more complex structure generation in the near future."
Researcher Affiliation: Industry
  Evidence: "Sherry Yang, Simon Batzner, Ruiqi Gao, Muratahan Aykol, Alexander Gaunt, Brendan McMorrow, Danilo Rezende, Dale Schuurmans, Igor Mordatch, Ekin Dogus Cubuk — Google DeepMind"
Pseudocode: Yes
  Evidence: Algorithm 1, "Generative Hierarchical Materials Search"
Open Source Code: No
  Evidence: The NeurIPS checklist (Question 5) states: "Does the paper provide open access to the data and code, with sufficient instructions to faithfully reproduce the main experimental results, as described in supplemental material? Answer: [No] Justification: The data and prompts are all public and we have provided the information. We are working on open sourcing the code after going through the internal approval process."
Open Datasets: Yes
  Evidence: "Meanwhile, many crystal databases already feature paired data, D_lo = {(z_i, x_i)}_{i=1}^n, linking chemical formulae to detailed crystal structures. Given this observation, we propose to factorize the crystal generator as π = π_hi · π_lo, where π_hi : G → Δ(Z) and π_lo : Z → Δ(X), so that π_hi and π_lo can be trained using different datasets D_hi and D_lo." The paper also mentions and cites crystal databases such as the Materials Project [10], ICSD [11], OQMD [12], and NOMAD [25].
Dataset Splits: No
  Evidence: The paper describes training and testing procedures, but does not explicitly specify a validation split (e.g., percentages or counts for a validation set used during training) distinct from the test set.
Hardware Specification: Yes
  Evidence: "Training hardware: 64 TPU-v4 chips"
Software Dependencies: No
  Evidence: The paper mentions software such as VASP, pymatgen, and atomate, and specifies the Adam optimizer parameters, but does not provide version numbers for these software dependencies or libraries.
Experiment Setup: Yes
  Evidence: Table 8, "Hyperparameters for training the diffusion model in GenMS", lists specific values such as learning rate 5e-5, batch size 512, training steps 200,000, and optimizer Adam (β1 = 0.9, β2 = 0.99).
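The factorized generator quoted in the Open Datasets entry (π = π_hi · π_lo, with π_hi mapping a language query to candidate formulae and π_lo mapping a formula to a structure) can be sketched as a toy two-stage pipeline. All function names and the candidate lists below are illustrative stand-ins, not the authors' implementation; the real system samples formulae from an LLM and structures from a diffusion model.

```python
import random


def pi_hi(query: str) -> list[str]:
    """Illustrative high-level generator pi_hi: G -> Delta(Z).
    Maps a natural-language query to candidate chemical formulae
    (a stand-in for the paper's LLM stage)."""
    # Hypothetical keyword lookup; the real pi_hi is a language model.
    candidates = {"perovskite": ["CaTiO3", "SrTiO3"], "spinel": ["MgAl2O4"]}
    for key, formulae in candidates.items():
        if key in query.lower():
            return formulae
    return ["NaCl"]


def pi_lo(formula: str) -> dict:
    """Illustrative low-level generator pi_lo: Z -> Delta(X).
    Maps a formula to a crystal structure (a stand-in for the
    paper's diffusion model)."""
    return {"formula": formula, "structure": f"<structure for {formula}>"}


def generate(query: str, seed: int = 0) -> dict:
    """Factorized sampling: draw z ~ pi_hi(query), then x ~ pi_lo(z)."""
    rng = random.Random(seed)
    formula = rng.choice(pi_hi(query))
    return pi_lo(formula)
```

This mirrors why the two stages can be trained on different datasets: pi_hi only needs (query, formula) pairs, while pi_lo only needs (formula, structure) pairs.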
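The Table 8 hyperparameters reported in the Experiment Setup entry can be collected into a single training configuration. This is a minimal sketch: the dict keys are illustrative names, and only the values come from the paper.

```python
# Training hyperparameters for the GenMS diffusion model as reported in
# Table 8 of the paper. Key names are hypothetical; values are the paper's.
DIFFUSION_TRAIN_CONFIG = {
    "learning_rate": 5e-5,
    "batch_size": 512,
    "training_steps": 200_000,
    "optimizer": "adam",
    "adam_beta1": 0.9,
    "adam_beta2": 0.99,
}
```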