Deep Symbolic Superoptimization Without Human Knowledge
Authors: Hui Shi, Yang Zhang, Xinyun Chen, Yuandong Tian, Jishen Zhao
ICLR 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We performed two experiments. The first experiment compares HISS with human-independent naive search algorithms. The second experiment compares HISS with existing human-dependent state-of-the-art systems on benchmark datasets. |
| Researcher Affiliation | Collaboration | UC San Diego, MIT-IBM Watson AI Lab, UC Berkeley, Facebook AI Research |
| Pseudocode | No | No pseudocode or algorithm blocks are provided in the paper. |
| Open Source Code | Yes | The code is available at https://github.com/shihui2010/symbolic_simplifier. |
| Open Datasets | Yes | From the remaining expressions, we sample 900 expressions as the training set, 300 as the validation set and 300 as the test set. The Halide dataset (Chen & Tian, 2018) contains around 10,000 training sequences, 1,000 testing sequences, and 1,000 validation sequences, generated and split randomly. |
| Dataset Splits | Yes | From the remaining expressions, we sample 900 expressions as the training set, 300 as the validation set and 300 as the test set. The Halide dataset (Chen & Tian, 2018) contains around 10,000 training sequences, 1,000 testing sequences, and 1,000 validation sequences, generated and split randomly. |
| Hardware Specification | Yes | The stage-two training takes two weeks to train the HISS for full simplification pipeline on RTX 2080 for 40 epochs (400k steps). |
| Software Dependencies | No | The paper mentions using the ADAM optimizer but does not specify version numbers for any software dependencies like programming languages, libraries, or frameworks. |
| Experiment Setup | Yes | The input to the network is one-hot encoded sequences where the vocabulary size is 50, then the input is encoded by a single fully connected layer with output size 32. The hidden units of LSTM are set to 64 for both encoder and decoder, as a common setting adopted in many previous works (Liu & Lee, 2017), and the number of layers is 1. The output size of the encoder is 64, and the output size of the decoder is equal to the vocabulary size (50). The subtree selector consists of two feed-forward layers with output sizes of 128 and 1 respectively. The model is trained with the ADAM optimizer with a learning rate of 1e-3. ... The penalty of not getting an equivalent expression, β (as in Eq. (8)), is set to 0.1. ... In all the experiments, we set beam size k = 20 and s = 20. |
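
To make the quoted hyperparameters concrete, the sketch below wires together the stated dimensions (vocabulary size 50, input encoding size 32, single-layer LSTM encoder and decoder with 64 hidden units, a subtree selector with layer outputs of 128 and 1, and ADAM with learning rate 1e-3) in PyTorch. This is a minimal illustration, not the authors' implementation: the class name `EncoderDecoderSketch`, the shared input projection, and the ReLU between the selector layers are assumptions, and the beam-search hyperparameters (k = 20, s = 20) are not shown. The released code at https://github.com/shihui2010/symbolic_simplifier is authoritative.

```python
import torch
import torch.nn as nn

VOCAB_SIZE = 50   # one-hot token vocabulary (from the paper)
EMBED_SIZE = 32   # output size of the single fully connected input layer
HIDDEN_SIZE = 64  # LSTM hidden units for both encoder and decoder


class EncoderDecoderSketch(nn.Module):
    """Illustrative encoder-decoder with a subtree-selector head.

    Dimensions follow the paper's experiment setup; the module layout
    itself is an assumption made for this sketch.
    """

    def __init__(self):
        super().__init__()
        # One-hot tokens are projected by a single fully connected layer.
        self.input_proj = nn.Linear(VOCAB_SIZE, EMBED_SIZE)
        # Single-layer LSTM encoder and decoder, 64 hidden units each.
        self.encoder = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, num_layers=1, batch_first=True)
        self.decoder = nn.LSTM(EMBED_SIZE, HIDDEN_SIZE, num_layers=1, batch_first=True)
        # Decoder output size equals the vocabulary size (50).
        self.output_proj = nn.Linear(HIDDEN_SIZE, VOCAB_SIZE)
        # Subtree selector: two feed-forward layers with outputs 128 and 1.
        self.selector = nn.Sequential(
            nn.Linear(HIDDEN_SIZE, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, src_onehot, tgt_onehot):
        # Encode the input expression sequence.
        enc_out, enc_state = self.encoder(self.input_proj(src_onehot))
        # Score each encoder position as a candidate subtree to rewrite.
        subtree_scores = self.selector(enc_out).squeeze(-1)
        # Decode conditioned on the encoder's final state (teacher forcing).
        dec_out, _ = self.decoder(self.input_proj(tgt_onehot), enc_state)
        token_logits = self.output_proj(dec_out)
        return subtree_scores, token_logits


model = EncoderDecoderSketch()
# ADAM optimizer with learning rate 1e-3, as stated in the paper.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
```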
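
The quoted 900/300/300 expression split amounts to a simple random partition of the filtered expressions. The snippet below is an illustrative sketch of that partition, not the authors' preprocessing script; the `expressions` list and the seed are placeholders.

```python
import random

# Placeholder: `expressions` stands in for the filtered Halide expressions.
expressions = [...]

random.seed(0)  # seed chosen arbitrarily for this sketch
random.shuffle(expressions)

train_set = expressions[:900]       # 900 training expressions
valid_set = expressions[900:1200]   # 300 validation expressions
test_set = expressions[1200:1500]   # 300 test expressions
```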