Learning to Optimize Variational Quantum Circuits to Solve Combinatorial Problems

Authors: Sami Khairy, Ruslan Shaydulin, Lukasz Cincio, Yuri Alexeev, Prasanna Balaprakash (pp. 2367-2375)

AAAI 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Extensive simulations using the IBM Qiskit Aer quantum circuit simulator demonstrate that our proposed RL and KDE-based approaches reduce the optimality gap by factors up to 30.15 when compared with other commonly used off-the-shelf optimizers. (See the optimality-gap note after this table.)
Researcher Affiliation | Collaboration | (1) Illinois Institute of Technology, (2) Clemson University, (3) Los Alamos National Laboratory, (4) Argonne National Laboratory
Pseudocode | No | The paper describes algorithms in prose and mathematical formulations but does not include any explicitly labeled pseudocode blocks or algorithm listings.
Open Source Code | No | The paper does not provide an explicit statement about releasing its source code or a direct link to a code repository for the methodology described.
Open Datasets | No | To construct G_Train, we choose one representative graph instance of 8 vertices from each class and distribution, amounting to |G_Train| = 7 training instances. Four classes of graphs are considered: (1) Erdos-Renyi random graphs G_R(n_R, e_p), where n_R is the number of vertices and e_p is the edge generation probability; (2) ladder graphs G_L(n_L), where n_L is the length of the ladder; (3) barbell graphs G_B(n_B), formed by connecting two complete graphs K_{n_B} by an edge; and (4) caveman graphs G_C(n_C, n_k), where n_C is the number of cliques and n_k is the size of each clique. While the graph types are specified, no direct link, DOI, or repository is provided for the specific instances generated and used for training. (See the graph-generation sketch after this table.)
Dataset Splits | No | The paper defines G_Train (training instances) and G_Test (test instances) but does not specify a separate validation dataset split.
Hardware Specification | No | The paper mentions the 'IBM Qiskit Aer quantum circuit simulator' for simulations and 'Bebop, a high-performance computing cluster operated by the Laboratory Computing Resource Center at Argonne National Laboratory' in the acknowledgments, but does not provide specific details such as CPU/GPU models, memory, or other hardware specifications used for computation.
Software Dependencies | No | The paper mentions the 'IBM Qiskit Aer quantum circuit simulator' and the 'NLopt nonlinear optimization package (Johnson 2019)' but does not provide specific version numbers for these software dependencies.
Experiment Setup | Yes | A training episode is defined to be a trajectory of length T = 64... Training is performed for 750 epochs of 128 episodes each... We train our proposed deep RL framework using the actor-critic Proximal Policy Optimization (PPO) algorithm (Schulman et al. 2017)... Fully connected multilayer perceptron networks with two hidden layers are used for both the actor (policy) and critic (value) networks. Each hidden layer has 64 neurons. Tanh activation units are used in all neurons. The range of the output neurons is scaled to [-0.1, 0.1]. The discount factor and number of history iterations in the state formulation are set to ζ = 0.99 and L = 4, respectively. A Gaussian policy with a constant noise variance of e^-6 is adopted throughout the training... (See the training-setup sketch after this table.)
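
Optimality-gap note. The "factors up to 30.15" figure quoted in the Research Type row is a ratio of optimality gaps, not an absolute gap. The formula below is an assumed reading based on the common Max-Cut QAOA convention (gap as the relative shortfall of the attained expected cut value from the true maximum cut); the paper's own definition should be consulted for the exact expression.

    \[
    \mathrm{gap}(\theta) \;=\; 1 - \frac{\langle C(\theta) \rangle}{C_{\max}},
    \qquad
    \text{reduction factor} \;=\; \frac{\mathrm{gap}_{\text{off-the-shelf}}}{\mathrm{gap}_{\text{RL/KDE}}} \;\leq\; 30.15 .
    \]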
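
Graph-generation sketch. Because the Open Datasets row notes that no instances are published, reproducing G_Train requires regenerating the four graph classes. The sketch below assumes networkx generators and one 8-vertex instance per class; the edge probability and clique parameters are placeholders, and the seven training instances presumably cover several parameter settings (e.g., multiple Erdos-Renyi edge probabilities) that are not reproduced here.

    # Hedged sketch: 8-vertex instances of the four graph classes named in the paper.
    # All parameter values are assumptions for illustration; the paper does not
    # publish the exact instances it trained on.
    import networkx as nx

    def make_training_graphs(seed=0):
        """One 8-vertex instance per graph class; parameters are illustrative only."""
        return {
            # (1) Erdos-Renyi G_R(n_R, e_p): 8 vertices, assumed edge probability 0.5
            "erdos_renyi": nx.erdos_renyi_graph(n=8, p=0.5, seed=seed),
            # (2) ladder graph G_L(n_L): n_L = 4 rungs -> 2 * 4 = 8 vertices
            "ladder": nx.ladder_graph(4),
            # (3) barbell graph G_B(n_B): two K_4 cliques joined by an edge -> 8 vertices
            "barbell": nx.barbell_graph(4, 0),
            # (4) caveman graph G_C(n_C, n_k): 2 cliques of 4 vertices
            #     (nx.connected_caveman_graph may match the paper's intent more closely)
            "caveman": nx.caveman_graph(2, 4),
        }

    if __name__ == "__main__":
        for name, g in make_training_graphs().items():
            print(f"{name}: {g.number_of_nodes()} nodes, {g.number_of_edges()} edges")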
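
Training-setup sketch. The Experiment Setup row fully specifies the network shapes (two hidden layers of 64 tanh units for both actor and critic, outputs scaled to [-0.1, 0.1], constant Gaussian noise variance of e^-6). The sketch below reflects only that description in PyTorch; the state and action dimensions depend on the QAOA circuit and the L = 4 history window and are placeholders, and the PPO training loop itself (Schulman et al. 2017) is not reproduced.

    # Hedged sketch of the actor/critic networks described in the Experiment Setup row.
    # state_dim and action_dim are placeholders; PPO training is not reproduced here.
    import math
    import torch
    import torch.nn as nn

    class Actor(nn.Module):
        def __init__(self, state_dim, action_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 64), nn.Tanh(),
                nn.Linear(64, 64), nn.Tanh(),
                nn.Linear(64, action_dim), nn.Tanh(),
            )
            # constant Gaussian noise: variance e^-6, hence standard deviation e^-3
            self.std = math.exp(-3.0)

        def forward(self, state):
            mean = 0.1 * self.net(state)  # scale tanh outputs into [-0.1, 0.1]
            return torch.distributions.Normal(mean, self.std * torch.ones_like(mean))

    class Critic(nn.Module):
        def __init__(self, state_dim):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, 64), nn.Tanh(),
                nn.Linear(64, 64), nn.Tanh(),
                nn.Linear(64, 1),
            )

        def forward(self, state):
            return self.net(state)

    if __name__ == "__main__":
        # placeholder dimensions; the real ones depend on the QAOA depth and L = 4 history
        actor, critic = Actor(state_dim=16, action_dim=4), Critic(state_dim=16)
        dist = actor(torch.zeros(1, 16))
        print(dist.sample(), critic(torch.zeros(1, 16)))

    # Reported training scale from the quoted setup: episodes of length T = 64,
    # 750 epochs of 128 episodes each, discount factor 0.99.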