Learning Heuristics for Quantified Boolean Formulas through Reinforcement Learning

Authors: Gil Lederman, Markus Rabe, Sanjit Seshia, Edward A. Lee

ICLR 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We conducted several experiments to examine whether we can improve the heuristics of the logic solver CADET through our deep reinforcement learning approach.
Researcher Affiliation | Collaboration | Gil Lederman (Electrical Engineering and Computer Sciences, University of California at Berkeley, gilled@eecs.berkeley.edu); Markus N. Rabe (Google Research, mrabe@google.com); Edward A. Lee (Electrical Engineering and Computer Sciences, University of California at Berkeley, eal@eecs.berkeley.edu); Sanjit A. Seshia (Electrical Engineering and Computer Sciences, University of California at Berkeley, sseshia@eecs.berkeley.edu)
Pseudocode | No | Not found. The paper describes algorithms and network architectures but does not include explicit pseudocode or algorithm blocks.
Open Source Code | Yes | We provide the code and data of our experiments at https://github.com/lederg/learningqbf.
Open Datasets | Yes | We consider a set of formulas representing the search for reductions between collections of first-order formulas generated by Jordan & Kaiser (2013), which we call Reductions in the following. We additionally consider the 2QBF evaluation set of the annual competition of QBF solvers, QBFEVAL (Pulina, 2016).
Dataset Splits | Yes | We filtered out 2573 formulas that are solved without any heuristic decisions. In order to enable us to answer question 2 (see above), we further set aside a test set of 200 formulas, leaving us with a training set of 1835 formulas. (See the split sketch below the table.)
Hardware Specification | Yes | We trained a model on the reduction problems training set for 10M steps on an AWS server of type C5.
Software Dependencies | No | Not found. The paper mentions the CADET solver and uses terms like GNN, but does not specify software dependencies with version numbers.
Experiment Setup | Yes | The hyperparameters that resulted in the best model are δL = 16, δC = 64, and τ = 1, leading to a model with merely 8353 parameters. Literal embedding dimension: δL = 16; clause embedding dimension: δC = 64; learning rate: 0.0006 for the first 2m steps, then 0.0001; discount factor: γ = 0.99; gradient clipping: 2; number of iterations (size of graph convolution): 1; minimal number of timesteps per batch: 1200.
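The hyperparameter values quoted in the Experiment Setup row can be collected into a single configuration sketch. The dictionary below only restates the reported numbers; the key names are our own labels and are not identifiers from the authors' repository.

```python
# Hyperparameters reported for the best model (key names are our own labels,
# not taken from the released code at https://github.com/lederg/learningqbf).
BEST_MODEL_CONFIG = {
    "literal_embedding_dim": 16,       # delta_L
    "clause_embedding_dim": 64,        # delta_C
    "temperature_tau": 1,              # tau
    "learning_rate_initial": 6e-4,     # used for the first 2m steps
    "learning_rate_final": 1e-4,       # used afterwards
    "discount_factor": 0.99,           # gamma
    "gradient_clipping": 2,
    "graph_convolution_iterations": 1, # size of graph convolution
    "min_timesteps_per_batch": 1200,
}
```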
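The split described in the Dataset Splits row (filter out trivially solved formulas, hold out 200 for testing, keep 1835 for training) could be reproduced along these lines. This is a minimal sketch, assuming the formulas are available as a list of file paths and that a predicate can report whether CADET solves a formula without any heuristic decisions; `formula_paths` and `solved_without_decisions` are hypothetical names, not the authors' API.

```python
import random

def split_reductions_dataset(formula_paths, solved_without_decisions,
                             test_size=200, seed=0):
    """Filter trivially solved formulas, then hold out a fixed-size test set.

    `formula_paths`: list of paths to QBF formula files (assumption).
    `solved_without_decisions`: predicate that runs the solver and reports
    whether the formula was solved with zero heuristic decisions (assumption).
    """
    # Drop formulas that the solver finishes without any heuristic decisions.
    remaining = [p for p in formula_paths if not solved_without_decisions(p)]

    # Shuffle deterministically, then hold out `test_size` formulas for testing.
    rng = random.Random(seed)
    rng.shuffle(remaining)
    test_set = remaining[:test_size]
    train_set = remaining[test_size:]
    return train_set, test_set
```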