Learning Unitary Operators with Help From u(n)

Authors: Stephanie Hyland, Gunnar Rätsch

AAAI 2017

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We demonstrate the effectiveness of this parametrization on the problem of learning arbitrary unitary operators, comparing to several baselines and outperforming a recently-proposed lower-dimensional parametrization. We additionally use our parametrization to generalize a recently-proposed unitary recurrent neural network to arbitrary unitary matrices, using it to solve standard long-memory tasks.
Researcher Affiliation | Academia | Stephanie L. Hyland (1,2), Gunnar Rätsch (1); (1) Department of Computer Science, ETH Zurich, Switzerland; (2) Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medical, New York
Pseudocode | No | The paper describes derivations and processes using mathematical equations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | The code is available here: https://github.com/ratschlab/uRNN.
Open Datasets | Yes | We create an n × n unitary matrix U (the next section describes how this is done), then sample vectors x ∈ ℂ^n with normally-distributed coefficients. We create y_j = Ux_j + ε_j where ε ∼ N(0, σ²). The objective is to recover U from the {x_j, y_j} pairs... The adding problem and the memory problem, first described in (Hochreiter and Schmidhuber 1997). (A data-generation sketch follows the table.)
Dataset Splits | Yes | The test and validation sets both contain 100,000 examples.
Hardware Specification | Yes | This amounts to the guRNN processing 61.2 and 37.0 examples per second in the two tasks, on a GeForce GTX 1080 GPU.
Software Dependencies | No | The paper mentions 'implemented in Python', 'TensorFlow', 'scipy builtin expm', and 'eigh (also in scipy)', but does not provide specific version numbers for these software components.
Experiment Setup | Yes | In practice we set σ² = 0.01 and use a fixed learning rate of 0.001. For each experimental run (a single U), we generate one million training {x_j, y_j} pairs, divided into batches of size 20. For our model (guRNN), we use f = relu and f = tanh for the nonlinearities... We used β = 1.4 and β = 1.05... The learning rate was set to α = 10⁻³ for all models except IRNN, which used α = 10⁻⁴. We used RMSProp (Tieleman and Hinton 2012) with decay 0.9 and no momentum. The batch size was 20. (An optimizer-configuration sketch follows the table.)
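The Open Datasets row quotes the paper's synthetic setup for the operator-learning task: draw a random unitary U, sample x with normally-distributed coefficients, and form y_j = Ux_j + ε_j with ε ∼ N(0, σ²). Below is a minimal sketch of that data generation, assuming NumPy/SciPy and the σ² = 0.01 value from the Experiment Setup row; the helper names (make_unitary, generate_pairs) and the construction of U by exponentiating a random skew-Hermitian matrix are illustrative assumptions, not the authors' code (the paper instead parametrizes u(n) through a Lie-algebra basis).

```python
import numpy as np
from scipy.linalg import expm

def make_unitary(n, rng):
    # A random complex matrix A gives a skew-Hermitian L = A - A^H,
    # and the matrix exponential of a skew-Hermitian matrix is unitary.
    a = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    return expm(a - a.conj().T)

def generate_pairs(U, num_pairs, sigma2=0.01, rng=None):
    # x_j has normally-distributed complex coefficients; y_j = U x_j + eps_j, eps ~ N(0, sigma^2).
    rng = np.random.default_rng() if rng is None else rng
    n = U.shape[0]
    x = rng.standard_normal((num_pairs, n)) + 1j * rng.standard_normal((num_pairs, n))
    noise = np.sqrt(sigma2) * (rng.standard_normal((num_pairs, n))
                               + 1j * rng.standard_normal((num_pairs, n)))
    y = x @ U.T + noise  # rows are x_j^T, so x @ U.T stacks y_j = U x_j row-wise
    return x, y

rng = np.random.default_rng(0)
U = make_unitary(8, rng)
# The paper generates one million training pairs per run; a smaller demo size is used here.
x_train, y_train = generate_pairs(U, num_pairs=10_000, sigma2=0.01, rng=rng)
```

The skew-Hermitian route relies on scipy's expm, which the Software Dependencies row notes the paper also uses; it is one standard way to produce a random unitary matrix, not necessarily the authors' sampling procedure.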
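The Experiment Setup row also lists the optimizer settings for the long-memory tasks (RMSProp with decay 0.9, no momentum, α = 10⁻³, batch size 20). The sketch below maps those settings onto the current tf.keras API; the paper only states that it used TensorFlow, so this exact interface is an assumption.

```python
import tensorflow as tf

# Illustrative mapping of the quoted hyperparameters, not the authors' training script.
optimizer = tf.keras.optimizers.RMSprop(
    learning_rate=1e-3,  # alpha = 10^-3 for all models; the IRNN baseline used 10^-4
    rho=0.9,             # "decay 0.9"
    momentum=0.0,        # "no momentum"
)
BATCH_SIZE = 20
```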