A Chance-Constrained Generative Framework for Sequence Optimization
Authors: Xianggen Liu, Qiang Liu, Sen Song, Jian Peng
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Extensive experimental results in three domains demonstrate the superiority of our approach over the existing sequence optimization methods. We first evaluate the effectiveness of our framework on various domains, including arithmetic expressions, python programs, and molecules. Experimental results show that CCGF outperforms all the sequence optimization methods among all these domains. |
| Researcher Affiliation | Academia | 1Laboratory for Brain and Intelligence and Department of Biomedical Engineering, Tsinghua University, Beijing, China. 2Department of Computer Science, University of Illinois at Urbana-Champaign, IL, USA. 3Department of Computer Science, University of Texas at Austin, TX, USA. |
| Pseudocode | Yes | Algorithm 1 Training of CCGF |
| Open Source Code | No | The paper does not provide a specific link to its source code or explicitly state that its code is open-source or available in supplementary materials. |
| Open Datasets | Yes | We use the ZINC molecular dataset, which contains 250,000 drug-like molecules (Irwin et al., 2012). We randomly collect 100,000 univariate arithmetic expressions that have at most 15 production rules. We use all of them as the training data of CCGF, and 90% of them are used for the training of MLE and 10% of them for validating of MLE (similar protocols are also applied to the following experiments). We follow Kusner et al. (2017) to collect 130,000 univariate programs as the training data. |
| Dataset Splits | Yes | We use all of them as the training data of CCGF, and 90% of them are used for the training of MLE and 10% of them for validating of MLE (similar protocols are also applied to the following experiments). |
| Hardware Specification | No | The paper does not specify any particular hardware (e.g., GPU models, CPU models, memory) used for running the experiments. |
| Software Dependencies | No | The paper mentions the 'PyTorch library (Paszke et al., 2017)' and the 'RDKit (Landrum) library' but does not specify version numbers for these software components. |
| Experiment Setup | Yes | CCGF is implemented based on the PyTorch library (Paszke et al., 2017). The Adam algorithm with a learning rate of r is used to update their parameters. The SGD algorithm with a learning rate of rα is used to update α. We fix the validity threshold T to 0.5 and adjust ϵ for each task... The batch size B is set to 1,000... We investigate influences of classification threshold Tacc to the performance of CCGF... and finally set Tacc to 0.1. Their selected values for each task are listed in Appendix E. |
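The Experiment Setup row names the optimizers and hyperparameters the paper reports. As a rough illustration of how that configuration maps onto PyTorch, the sketch below wires up the two optimizers (Adam for the model parameters, SGD for the constraint variable α) with the stated batch size and thresholds. The model architecture, the learning-rate values, and the `alpha` variable are placeholders, not the paper's actual implementation; the paper's selected values are in its Appendix E.

```python
import torch

# Hedged sketch of the quoted setup; all concrete values except B, T, and
# T_acc are illustrative placeholders, not the paper's reported choices.
model = torch.nn.GRU(input_size=32, hidden_size=64)  # stand-in sequence generator
alpha = torch.nn.Parameter(torch.tensor(0.0))        # chance-constraint variable α (assumed scalar)

r = 1e-3        # Adam learning rate "r" (value not given in the quote)
r_alpha = 1e-2  # SGD learning rate "rα" (value not given in the quote)
B = 1000        # batch size stated in the paper

# Adam updates the model parameters; SGD updates α, as described.
opt_model = torch.optim.Adam(model.parameters(), lr=r)
opt_alpha = torch.optim.SGD([alpha], lr=r_alpha)

T = 0.5      # validity threshold from the paper
T_acc = 0.1  # classification threshold from the paper
```

This only fixes the optimizer wiring; the CCGF training loop itself (Algorithm 1 in the paper) is not reconstructed here.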