Unveiling the Potential of AI for Nanomaterial Morphology Prediction
Authors: Ivan Dubrovsky, Andrei Dmitrenko, Aleksei Dmitrenko, Nikita Serov, Vladimir Vinogradov
ICML 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | This study explores the potential of AI to predict the morphology of nanoparticles within the data availability constraints. For that, we first generated a new multi-modal dataset that is double the size of analogous studies. Then, we systematically evaluated the performance of classical machine learning and large language models in predicting nanomaterial shapes and sizes. Finally, we prototyped a text-to-image system, discussed the obtained empirical results, as well as the limitations and promises of existing approaches. |
| Researcher Affiliation | Collaboration | (1) Center for AI in Chemistry, Chem Bio Cluster, ITMO University, St. Petersburg, Russia; (2) D ONE AG, Zurich, Greater Zurich Area, Switzerland. |
| Pseudocode | No | The paper includes architectural diagrams (e.g., Figure 2, Figure 5A) but does not present any explicitly labeled 'Pseudocode' or 'Algorithm' blocks with structured steps. |
| Open Source Code | Yes | All datasets, scripts and results described in this work are available for reproducibility and possible transfer learning applications in this repository: https://github.com/acid-design-lab/Nanomaterial_Morphology_Prediction. |
| Open Datasets | Yes | All datasets, scripts and results described in this work are available for reproducibility and possible transfer learning applications in this repository: https://github.com/acid-design-lab/Nanomaterial_Morphology_Prediction. |
| Dataset Splits | Yes | Hyperparameter optimization was performed using 5-fold cross-validated grid search (see the grid-search sketch after the table). |
| Hardware Specification | Yes | CPU: AMD Ryzen 7 3700X 3.60 GHz 8-Core Processor; GPU: NVIDIA GeForce RTX 3090, 24 GB GPU memory; RAM: 32.0 GB; Operating system: Windows 11 Pro N |
| Software Dependencies | No | The paper mentions 'Python 3.9' and the 'scikit-learn library', but only Python is given a specific version number; other key libraries and models used (e.g., PyTorch Lightning Bolts, BERT) lack explicit version details. |
| Experiment Setup | Yes | In case of the Random Forest, optimization of the following parameters was performed: n_estimators, max_features, max_depth, min_samples_leaf, max_leaf_nodes. In case of Gradient Boosted Trees, the optimized parameters were: gamma, colsample_bytree, max_depth, n_estimators, learning_rate. The optimal set of hyperparameters was 128×128 for the image size, 64 for the batch size, 0.001 for the learning rate, and 0.01 for the KL divergence coefficient (see the sketches after the table). |
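
The paper reports 5-fold cross-validated grid search over the named Random Forest and Gradient Boosted Trees hyperparameters. Below is a minimal sketch of that setup, assuming the standard scikit-learn and xgboost implementations; the grid values are illustrative placeholders, since the paper's table rows above list only the parameter names, not the search ranges.

```python
# Sketch of 5-fold cross-validated grid search over the hyperparameters
# named in the paper. Grid values are illustrative, not the paper's.
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from xgboost import XGBClassifier

# Random Forest search space (parameter names from the paper).
rf_grid = {
    "n_estimators": [100, 300, 500],
    "max_features": ["sqrt", "log2"],
    "max_depth": [None, 10, 20],
    "min_samples_leaf": [1, 2, 4],
    "max_leaf_nodes": [None, 50, 100],
}

# Gradient Boosted Trees search space (parameter names from the paper).
xgb_grid = {
    "gamma": [0, 0.1, 0.5],
    "colsample_bytree": [0.6, 0.8, 1.0],
    "max_depth": [3, 6, 9],
    "n_estimators": [100, 300],
    "learning_rate": [0.01, 0.1],
}

def tune(estimator, param_grid, X, y):
    """Run the 5-fold cross-validated grid search described in the paper."""
    search = GridSearchCV(estimator, param_grid, cv=5, n_jobs=-1)
    search.fit(X, y)
    return search.best_estimator_, search.best_params_

# Usage: tune(RandomForestClassifier(), rf_grid, X, y)
#        tune(XGBClassifier(), xgb_grid, X, y)
```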
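
The remaining reported settings (128×128 images, batch size 64, learning rate 0.001, KL divergence coefficient 0.01) describe the generative model of the text-to-image prototype. The sketch below shows how a KL-weighted VAE objective with those values could look; the encoder/decoder outputs (`x_hat`, `mu`, `log_var`) are hypothetical stand-ins, and the loss form is a generic beta-VAE objective, not necessarily the paper's exact formulation.

```python
# Sketch of a KL-weighted VAE training objective using the reported
# optimal hyperparameters. Architecture details are assumptions.
import torch
import torch.nn.functional as F

IMAGE_SIZE = 128      # reported optimal input resolution (128x128)
BATCH_SIZE = 64       # reported optimal batch size
LEARNING_RATE = 1e-3  # reported optimal learning rate
KL_COEFF = 0.01       # reported weight on the KL divergence term

def vae_loss(x, x_hat, mu, log_var, kl_coeff=KL_COEFF):
    """Reconstruction term plus KL-weighted regularizer (beta-VAE style)."""
    recon = F.mse_loss(x_hat, x, reduction="mean")
    kl = -0.5 * torch.mean(1 + log_var - mu.pow(2) - log_var.exp())
    return recon + kl_coeff * kl

# The optimizer would then be configured with the reported learning rate,
# e.g. torch.optim.Adam(model.parameters(), lr=LEARNING_RATE).
```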