Learning Unitary Operators with Help From u(n)
Authors: Stephanie Hyland, Gunnar Rätsch
AAAI 2017
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We demonstrate the effectiveness of this parametrization on the problem of learning arbitrary unitary operators, comparing to several baselines and outperforming a recently-proposed lower-dimensional parametrization. We additionally use our parametrization to generalize a recently-proposed unitary recurrent neural network to arbitrary unitary matrices, using it to solve standard long-memory tasks. [A sketch of this parametrization appears after this table.] |
| Researcher Affiliation | Academia | Stephanie L. Hyland (1,2), Gunnar Rätsch (1); (1) Department of Computer Science, ETH Zurich, Switzerland; (2) Tri-Institutional Training Program in Computational Biology and Medicine, Weill Cornell Medical, New York |
| Pseudocode | No | The paper describes derivations and processes using mathematical equations and descriptive text, but it does not include any explicitly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | The code is available here: https://github.com/ratschlab/uRNN. |
| Open Datasets | Yes | We create an n × n unitary matrix U (the next section describes how this is done), then sample vectors x ∈ C^n with normally-distributed coefficients. We create y_j = U x_j + ε_j where ε ~ N(0, σ²). The objective is to recover U from the {x_j, y_j} pairs... The adding problem and the memory problem, first described in (Hochreiter and Schmidhuber 1997). [A data-generation sketch appears after this table.] |
| Dataset Splits | Yes | The test and validation sets both contain 100,000 examples. |
| Hardware Specification | Yes | This amounts to the guRNN processing 61.2 and 37.0 examples per second in the two tasks, on a GeForce GTX 1080 GPU. |
| Software Dependencies | No | The paper mentions 'implemented in Python', 'TensorFlow', 'scipy builtin expm', and 'eigh (also in scipy)', but does not provide specific version numbers for these software components. |
| Experiment Setup | Yes | In practice we set σ² = 0.01 and use a fixed learning rate of 0.001. For each experimental run (a single U), we generate one million training {x_j, y_j} pairs, divided into batches of size 20. For our model (guRNN), we use f = relu and f = tanh for the nonlinearities... We used β = 1.4 and β = 1.05... The learning rate was set to α = 10⁻³ for all models except IRNN, which used α = 10⁻⁴. We used RMSProp (Tieleman and Hinton 2012) with decay 0.9 and no momentum. The batch size was 20. |
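
The parametrization referenced in the Research Type row represents a unitary matrix as the matrix exponential of an element of the Lie algebra u(n), i.e. of a skew-Hermitian matrix, which is why the dependencies row mentions scipy's `expm`. The block below is a minimal illustrative sketch of that construction, not the authors' released code at https://github.com/ratschlab/uRNN; the basis ordering and the names `u_n_basis` and `unitary_from_params` are assumptions made here.

```python
# Sketch: map n^2 real parameters to a unitary matrix via a basis of u(n).
import numpy as np
from scipy.linalg import expm

def u_n_basis(n):
    """Return n^2 skew-Hermitian matrices spanning the Lie algebra u(n)."""
    basis = []
    # n purely imaginary diagonal generators
    for i in range(n):
        T = np.zeros((n, n), dtype=complex)
        T[i, i] = 1j
        basis.append(T)
    # off-diagonal generators: real antisymmetric and imaginary symmetric
    for i in range(n):
        for j in range(i + 1, n):
            A = np.zeros((n, n), dtype=complex)
            A[i, j], A[j, i] = 1.0, -1.0   # real antisymmetric
            basis.append(A)
            S = np.zeros((n, n), dtype=complex)
            S[i, j], S[j, i] = 1j, 1j      # purely imaginary symmetric
            basis.append(S)
    return basis  # len == n**2

def unitary_from_params(lambdas, basis):
    """Combine the basis with real coefficients and exponentiate."""
    L = sum(l * T for l, T in zip(lambdas, basis))  # skew-Hermitian
    return expm(L)                                   # unitary

n = 4
basis = u_n_basis(n)
U = unitary_from_params(np.random.randn(n ** 2), basis)
assert np.allclose(U.conj().T @ U, np.eye(n), atol=1e-10)
```

Because L is skew-Hermitian, expm(L) is exactly unitary for any choice of the n² real parameters, which is the property the parametrization relies on.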
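The synthetic task quoted in the Open Datasets and Experiment Setup rows can be reproduced along the following lines. This is a minimal sketch under stated assumptions: the paper describes its own construction of the ground-truth U, for which `scipy.stats.unitary_group.rvs` is used here as a stand-in, and the function name `make_pairs` is illustrative rather than taken from the released code.

```python
# Sketch: sample x_j in C^n with normal coefficients and form y_j = U x_j + eps_j,
# with sigma^2 = 0.01 as quoted in the experiment setup.
import numpy as np
from scipy.stats import unitary_group

def make_pairs(n, num_examples, sigma2=0.01, seed=0):
    rng = np.random.default_rng(seed)
    U = unitary_group.rvs(n, random_state=seed)  # stand-in ground-truth unitary
    x = rng.standard_normal((num_examples, n)) + 1j * rng.standard_normal((num_examples, n))
    eps = np.sqrt(sigma2) * (rng.standard_normal((num_examples, n))
                             + 1j * rng.standard_normal((num_examples, n)))
    y = x @ U.T + eps                            # rows are y_j = U x_j + eps_j
    return U, x, y

# The paper reports one million training pairs per run, in batches of size 20,
# with 100,000 examples each for validation and test; a smaller draw is used here.
U, x_train, y_train = make_pairs(n=8, num_examples=100_000)
batch_indices = np.arange(len(x_train)).reshape(-1, 20)  # batches of size 20
```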