Physics-constrained Automatic Feature Engineering for Predictive Modeling in Materials Science
Authors: Ziyu Xiang, Mingzhou Fan, Guillermo Vázquez Tovar, William Trehern, Byung-Jun Yoon, Xiaofeng Qian, Raymundo Arroyave, Xiaoning Qian
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To evaluate our proposed AFE strategies, we perform experiments with three real-world materials science datasets: one for classification of metal/non-metal materials, one for regression to get alloy elastic behavior based on alloy compositions, and the third dataset for predicting a material's phase transition temperature with the physics constraints of feature groups. |
| Researcher Affiliation | Academia | 1Electrical & Computer Engineering, 2Materials Science & Engineering, 3Computer Science & Engineering, Texas A&M University, College Station, Texas 77843, 4Computational Science Initiative, Brookhaven National Laboratory, Upton, NY 11973 |
| Pseudocode | Yes | Algorithm 1 DQN for Automatic Feature Engineering |
| Open Source Code | Yes | Our code is open-source and available at https://github.com/ziyux/AFE. |
| Open Datasets | Yes | The classification problem is based on a dataset of 10 prototype structures (NaCl, CsCl, ZnS, CaF2, Cr3Si, SiC, TiO2, ZnO, FeAs, NiAs) with a total number of 260 materials from one of the experiments reported in Ouyang et al. (2018) |
| Dataset Splits | No | first randomly splitting the dataset with 182 materials in the training set and the remaining 78 materials in the test set (7:3). |
| Hardware Specification | Yes | We run all the experiments on the platform with the hardware configuration of Intel Xeon E5-2670, 64GB 1866MHz RAM and 2 NVIDIA K20 GPUs. |
| Software Dependencies | No | The paper describes model architecture and hyperparameters but does not provide specific software dependencies (e.g., library names with version numbers) needed for replication beyond the general statement 'Our code is open-source and available at https://github.com/ziyux/AFE.' |
| Experiment Setup | Yes | For DQN, we have adopted a two-layer Q-network with the corresponding hidden dimensions {150, 120}, and the ReLU activation function is used for both layers. The following hyperparameters are set for DQN training: learning rate: 0.001; experience replay batch size: 64; gamma: 0.99; epsilon: 1.0 (decay 0.99, min 0.05). |
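The Q-network and exploration schedule from the setup row can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation (their code is at https://github.com/ziyux/AFE): hidden sizes {150, 120} with ReLU and the epsilon schedule (start 1.0, decay 0.99, floor 0.05) come from the paper, while `state_dim`, `n_actions`, and the weight initialization are placeholder assumptions.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

class QNetwork:
    """Two hidden layers of 150 and 120 units with ReLU, as reported.

    state_dim and n_actions are illustrative; the paper does not give
    them in this excerpt.
    """
    def __init__(self, state_dim, n_actions, seed=0):
        rng = np.random.default_rng(seed)
        self.W1 = rng.normal(0.0, 0.1, (state_dim, 150))
        self.b1 = np.zeros(150)
        self.W2 = rng.normal(0.0, 0.1, (150, 120))
        self.b2 = np.zeros(120)
        self.W3 = rng.normal(0.0, 0.1, (120, n_actions))
        self.b3 = np.zeros(n_actions)

    def forward(self, s):
        h1 = relu(s @ self.W1 + self.b1)
        h2 = relu(h1 @ self.W2 + self.b2)
        return h2 @ self.W3 + self.b3  # one Q-value per action

def epsilon_greedy(q_values, epsilon, rng):
    """Random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

def decay_epsilon(epsilon, decay=0.99, eps_min=0.05):
    """Reported schedule: multiply by 0.99 per step, floored at 0.05."""
    return max(eps_min, epsilon * decay)
```

Training would additionally need the experience-replay buffer (batch size 64), the discount gamma = 0.99, and a gradient-based update of the weights (learning rate 0.001), which this sketch omits.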