Recurrent Quantum Neural Networks

Authors: Johannes Bausch

NeurIPS 2020 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | To study the model's performance, we provide an implementation in pytorch, which allows the relatively efficient optimization of parametrized quantum circuits with tens of thousands of parameters, and which demonstrates that the model does not appear to suffer from the vanishing gradient problem that plagues many existing quantum classifiers and classical RNNs. We establish a QRNN training setup by benchmarking optimization hyperparameters, and analyse suitable network topologies for simple memorisation and sequence prediction tasks from Elman's seminal paper (1990). We then proceed to evaluate the QRNN on MNIST classification, by feeding the QRNN each image pixel-by-pixel; with a network utilizing only 12 qubits we reach a test set accuracy over 95% when discriminating between the digits 0 and 1.
Researcher Affiliation | Academia | Johannes Bausch acknowledges support of the Draper's Research Fellowship at Pembroke College.
Pseudocode | No | The paper includes diagrams of quantum circuits and cell structures (e.g., Figures 1, 2, 3, 4) and describes how the overall procedure is assembled, but it does not provide formal pseudocode or algorithm blocks.
Open Source Code | No | The paper states, "We implemented the QRNN in pytorch," but it does not provide a link to the source code or an explicit statement about its public availability.
Open Datasets | Yes | Using the MNIST dataset (55,000 : 5,000 : 10,000 train:validate:test split; images cropped to 20 × 20 pixels first, then downscaled to size 10 × 10).
Dataset Splits | Yes | Using the MNIST dataset (55,000 : 5,000 : 10,000 train:validate:test split; images cropped to 20 × 20 pixels first, then downscaled to size 10 × 10). A hypothetical preprocessing sketch is given after the table.
Hardware Specification | No | All experiments were executed on 2-8 CPUs, and required between 500MB and 35GB of memory per core. This gives the number of CPUs and the memory requirements, but does not specify the exact CPU models or types, which limits hardware reproducibility.
Software Dependencies | No | We implemented the QRNN in pytorch, using custom quantum gate layers and operations that allow us to extract the predicted distributions at each step. The paper mentions "pytorch" and an "autograd framework" but does not specify their version numbers. A hypothetical sketch of such a custom gate layer is given after the table.
Experiment Setup | Yes | We establish a QRNN training setup by benchmarking optimization hyperparameters, and analyse suitable network topologies for simple memorisation and sequence prediction tasks from Elman's seminal paper (1990). We found the L-BFGS optimizer commonly used with VQE circuits highly numerically unstable, resulting in many runs with NaNs; we thus excluded it from this experiment. SGD has a very narrow window of good learning rates; RMSprop and Adam are less sensitive to this choice, with Adam generally outperforming the former, which makes it our default choice for all following experiments. Our findings and choices for the default initialisation are collected in fig. 6. The most influential meta parameter is the bias µ = π/2 which, as shown in fig. 5, places the initial polynomial η at the steepest slope of the activation function, which results in a large initial gradient. A hypothetical sketch of this optimizer and initialisation choice is given after the table.
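
The dataset description above (crop MNIST to 20 × 20, downscale to 10 × 10, 55,000 : 5,000 : 10,000 split) can be approximated with standard torchvision transforms. The following is a minimal sketch, assuming a centre crop and default interpolation; the paper does not state which cropping or resizing method was used, so these details are guesses.

```python
# Hypothetical preprocessing sketch (not the author's code): crop MNIST digits
# to 20x20, downscale to 10x10, and split into train/validate/test sets.
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

transform = transforms.Compose([
    transforms.CenterCrop(20),   # assumption: centre crop of the 28x28 image to 20x20
    transforms.Resize(10),       # downscale to 10x10 (interpolation mode is a guess)
    transforms.ToTensor(),
])

full_train = datasets.MNIST("data", train=True, download=True, transform=transform)
test_set = datasets.MNIST("data", train=False, download=True, transform=transform)
# 60,000 training images split into 55,000 for training and 5,000 for validation.
train_set, val_set = random_split(full_train, [55_000, 5_000])
```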
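The paper's custom quantum gate layers are not published, so the following is only a plausible sketch of what a differentiable, parametrized single-qubit rotation acting on a statevector could look like in pytorch. The class name, the choice of an R_y gate, and the statevector layout are assumptions for illustration, not the author's implementation.

```python
# Hypothetical sketch of a "custom quantum gate layer": a parametrized R_y
# rotation on one qubit of an n-qubit statevector, differentiable via autograd.
import math
import torch
import torch.nn as nn

class RYGate(nn.Module):
    def __init__(self, target: int):
        super().__init__()
        self.target = target
        # Initialise at pi/2, echoing the bias mu = pi/2 discussed above.
        self.theta = nn.Parameter(torch.tensor(math.pi / 2))

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        # state: complex tensor of shape (2,) * n_qubits
        c, s = torch.cos(self.theta / 2), torch.sin(self.theta / 2)
        ry = torch.stack([torch.stack([c, -s]), torch.stack([s, c])]).to(state.dtype)
        # Contract the 2x2 gate with the target qubit's axis, then restore axis order.
        state = torch.tensordot(ry, state, dims=([1], [self.target]))
        return torch.movedim(state, 0, self.target)

n = 3
state = torch.zeros((2,) * n, dtype=torch.complex64)
state[(0,) * n] = 1.0                  # |000>
out = RYGate(target=1)(state)
probs = out.abs() ** 2                 # measurement distribution, differentiable in theta
```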
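The quoted setup picks Adam as the default optimizer and initialises the bias at µ = π/2, which sits at the steepest point of a sin²-shaped activation. Below is a minimal, self-contained sketch of that choice; the learning rate and the toy objective are illustrative assumptions, not values from the paper.

```python
# Hypothetical training-setup sketch: Adam as the default optimizer and a
# rotation angle initialised at mu = pi/2, the steepest point of sin^2(theta/2).
import math
import torch

theta = torch.nn.Parameter(torch.tensor(math.pi / 2))   # bias mu = pi/2
optimizer = torch.optim.Adam([theta], lr=1e-2)           # learning rate is an assumption

for step in range(200):
    optimizer.zero_grad()
    p = torch.sin(theta / 2) ** 2    # probability of measuring |1> after R_y(theta)
    loss = (p - 0.9) ** 2            # toy target probability, for illustration only
    loss.backward()
    optimizer.step()
```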