Unveiling the Potential of AI for Nanomaterial Morphology Prediction
Authors: Ivan Dubrovsky, Andrei Dmitrenko, Aleksei Dmitrenko, Nikita Serov, Vladimir Vinogradov
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This study explores the potential of AI to predict the morphology of nanoparticles within the data availability constraints. For that, we first generated a new multi-modal dataset that is double the size of analogous studies. Then, we systematically evaluated the performance of classical machine learning and large language models in predicting nanomaterial shapes and sizes. Finally, we prototyped a text-to-image system, discussed the obtained empirical results, as well as the limitations and promises of existing approaches. |
| Researcher Affiliation | Collaboration | (1) Center for AI in Chemistry, Chem Bio Cluster, ITMO University, St. Petersburg, Russia; (2) D ONE AG, Zurich, Greater Zurich Area, Switzerland. |
| Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 2, Figure 5A) but does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps. |
| Open Source Code | Yes | All datasets, scripts and results described in this work are available for reproducibility and possible transfer learning applications in this repository: https://github.com/acid-design-lab/Nanomaterial_Morphology_Prediction. |
| Open Datasets | Yes | All datasets, scripts and results described in this work are available for reproducibility and possible transfer learning applications in this repository: https://github.com/acid-design-lab/Nanomaterial_Morphology_Prediction. |
| Dataset Splits | Yes | Hyperparameter optimization was performed using 5-fold cross-validated grid search (see the grid-search sketch after the table). |
| Hardware Specification | Yes | CPU: AMD Ryzen 7 3700X 3.60 GHz 8-Core Processor; GPU: NVIDIA GeForce RTX 3090, 24 GB GPU memory; RAM: 32.0 GB; Operating system: Windows 11 Pro N |
| Software Dependencies | No | The paper mentions 'Python 3.9' and the 'scikit-learn library', but only Python is given a specific version number; other key libraries and models used (e.g., PyTorch Lightning Bolts, BERT) lack explicit version details. |
| Experiment Setup | Yes | In case of the Random Forest, optimization of the following parameters was performed: n_estimators, max_features, max_depth, min_samples_leaf, max_leaf_nodes. In case of Gradient Boosted Trees, the optimized parameters were: gamma, colsample_bytree, max_depth, n_estimators, learning_rate. The optimal set of hyperparameters was 128×128 for the image size, 64 for the batch size, 0.001 for the learning rate, and 0.01 for the KL divergence coefficient (see the sketches after the table). |
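
The paper reports 5-fold cross-validated grid search over the named Random Forest and Gradient Boosted Trees hyperparameters. Below is a minimal sketch of that setup, assuming the standard scikit-learn and xgboost implementations; the grid values are illustrative placeholders, since the paper's table rows above list only the parameter names, not the search ranges.

```python
# Sketch of 5-fold cross-validated grid search over the hyperparameters
# named in the paper. Grid values are illustrative, not the paper's.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Random Forest search space (parameter names from the paper).
rf_grid = {
    "n_estimators": [100, 300, 500],
    "max_features": ["sqrt", "log2"],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 2, 4],
    "max_leaf_nodes": [None, 50, 100],
}

# Gradient Boosted Trees search space (parameter names from the paper).
xgb_grid = {
    "gamma": [0, 0.1, 0.5],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "max_depth": [3, 6, 9],
    "n_estimators": [100, 300],
    "learning_rate": [0.01, 0.1],
}

def tune(estimator, param_grid, X, y):
    """Run the 5-fold cross-validated grid search described in the paper."""
    search = GridSearchCV(estimator, param_grid, cv=5, n_jobs=-1)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_

# Usage: tune(RandomForestClassifier(), rf_grid, X, y)
#        tune(XGBClassifier(), xgb_grid, X, y)
```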
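
The remaining reported settings (128×128 images, batch size 64, learning rate 0.001, KL divergence coefficient 0.01) describe the generative model of the text-to-image prototype. The sketch below shows how a KL-weighted VAE objective with those values could look; the encoder/decoder outputs (`x_hat`, `mu`, `log_var`) are hypothetical stand-ins, and the loss form is a generic beta-VAE objective, not necessarily the paper's exact formulation.

```python
# Sketch of a KL-weighted VAE training objective using the reported
# optimal hyperparameters. Architecture details are assumptions.
import torch
import torch.nn.functional as F

IMAGE_SIZE = 128      # reported optimal input resolution (128x128)
BATCH_SIZE = 64       # reported optimal batch size
LEARNING_RATE = 1e-3  # reported optimal learning rate
KL_COEFF = 0.01       # reported weight on the KL divergence term

def vae_loss(x, x_hat, mu, log_var, kl_coeff=KL_COEFF):
    """Reconstruction term plus KL-weighted regularizer (beta-VAE style)."""
    recon = F.mse_loss(x_hat, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl_coeff * kl

# The optimizer would then be configured with the reported learning rate,
# e.g. torch.optim.Adam(model.parameters(), lr=LEARNING_RATE).
```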