Multi-objective Deep Data Generation with Correlated Property Control
Authors: Shiyu Wang, Xiaojie Guo, Xuanyang Lin, Bo Pan, Yuanqi Du, Yinkai Wang, Yanfang Ye, Ashley Petersen, Austin Leitgeb, Saleh Alkhalifa, Kevin Minbiole, William M. Wuest, Amarda Shehu, Liang Zhao
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The experiments demonstrate our model's superior performance in generating data with desired properties. The code of CorrVAE is available at https://github.com/shi-yu-wang/CorrVAE. Section 5: Experiments. |
| Researcher Affiliation | Collaboration | (1) Emory University, {shiyu.wang, mike.lin, bo.pan, william.wuest, liang.zhao}@emory.edu; (2) IBM Thomas J. Watson Research Center, xguo7@gmu.edu; (3) Cornell University, yd392@cornell.edu; (4) Tufts University, yinkai.wang@tufts.edu; (5) University of Notre Dame, yye7@nd.edu; (6) Villanova University, {apeter24, austin.leitgeb, kevin.minbiole}@villanova.edu; (7) Recursiv LLC, salehesam@gmail.com; (8) George Mason University, ashehu@gmu.edu |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code of CorrVAE is available at https://github.com/shi-yu-wang/CorrVAE. |
| Open Datasets | Yes | 1) The Quaternary Ammonium Compound (QAC) dataset is a real dataset that contains 462 quaternary ammonium compounds processed by the Minbiole Research Lab. An open-source cheminformatics and machine learning library was used to generate a number of properties or features for each of the compounds, of which molecular weight and the log P value were used as data properties in our experiments; 2) the QM9 dataset is an enumeration of 134,000 stable organic molecules with up to 9 heavy atoms [39]; 3) dSprites contains 737,280 images of 2D shapes procedurally generated from 6 ground-truth independent latent factors [33], of which shape, scale, x position and y position were employed in our experiments. To construct correlated properties, we additionally formed and tested a new property, x+y position, by summing x position and y position; and 4) the Pendulum dataset was originally synthesized to explore the causality of the model [46]. |
| Dataset Splits | No | The paper mentions 'training set' and 'test set' for various datasets but does not explicitly provide the specific percentages or sample counts for training, validation, and test splits needed for reproduction. It does not state explicit split ratios like '80/10/10'. |
| Hardware Specification | No | The paper does not provide specific details about the hardware used for running the experiments, such as GPU models, CPU types, or memory specifications. |
| Software Dependencies | No | The paper mentions 'An open-source cheminformatics and machine learning library' but does not provide specific software names with version numbers. |
| Experiment Setup | Yes | In Section 4.1.1, the paper mentions 'ρ1 and ρ2 are co-efficient hyper-parameters to penalize the two terms'. In Section 5.4.1, it states 'we train CorrVAE using shape, scale and three correlated properties x position, y position and x+y position while setting the dimension of w as 8'. |
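
The Open Datasets entry describes forming a correlated property on dSprites by summing x position and y position. The sketch below illustrates that construction on synthetic stand-in values, not on the actual dSprites factors. The variable names, the uniform sampling, and the `pearson` helper are assumptions made for illustration and are not taken from the paper or its repository.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


rng = random.Random(0)

# Hypothetical stand-ins for the independent x and y position factors,
# drawn uniformly from [0, 1]; the real dataset supplies these values.
x_pos = [rng.random() for _ in range(10_000)]
y_pos = [rng.random() for _ in range(10_000)]

# The derived property: x position plus y position for each sample.
xy_sum = [x + y for x, y in zip(x_pos, y_pos)]

corr_x_y = pearson(x_pos, y_pos)      # base factors: close to zero
corr_x_sum = pearson(x_pos, xy_sum)   # derived property vs. x position
corr_y_sum = pearson(y_pos, xy_sum)   # derived property vs. y position

print(f"corr(x, y)   = {corr_x_y:.3f}")
print(f"corr(x, x+y) = {corr_x_sum:.3f}")
print(f"corr(y, x+y) = {corr_y_sum:.3f}")
```

When the two base factors are independent and have equal variance, the sum correlates with each of them at 1/√2 ≈ 0.707, so the derived property is clearly correlated with both while the base factors stay uncorrelated with each other.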