projUNN: efficient method for training deep networks with unitary matrices
Authors: Bobak Kiani, Randall Balestriero, Yann LeCun, Seth Lloyd
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We propose in this section a variety of benchmarked experiments to validate the efficiency and performance of the proposed PROJUNN method focusing mostly on RNN tasks. We include further details of the experiments in Appendix D including a preliminary empirical analysis of PROJUNN in convolutional tasks. |
| Researcher Affiliation | Collaboration | Bobak T. Kiani (MIT, bkiani@mit.edu); Randall Balestriero (Meta AI, FAIR, rbalestriero@fb.com); Yann LeCun (NYU & Meta AI, FAIR, yann@fb.com); Seth Lloyd (MIT & Turing Inc., slloyd@mit.edu) |
| Pseudocode | Yes | Algorithm 1 PROJUNN update step (see the sketch after this table) |
| Open Source Code | Yes | code repository: https://github.com/facebookresearch/projUNN |
| Open Datasets | Yes | Toy model: learning random unitary... Adding task... Copy memory task... Permuted MNIST... CNN experiments... on CIFAR10 classification using a Resnet architecture... MNIST data. |
| Dataset Splits | Yes | 10% of the training set (same for all models) is set apart as validation set. (see the split sketch below) |
| Hardware Specification | No | The paper states "Relevant details are included in Appendix G.", but Appendix G is not part of the provided text, so specific hardware details cannot be confirmed. |
| Software Dependencies | No | The paper mentions software such as TensorFlow and PyTorch but does not specify version numbers or other software dependencies required for reproducibility. |
| Experiment Setup | Yes | Consistent with [35], we train our PROJUNN-T using an RNN with hidden dimension of 170 and the RMSprop optimizer to reduce the mean-squared error of the output with respect to the target... train networks with batch size 128 using the RMSProp algorithm... Training occurs for 200 epochs... the learning rate for unitary parameters was set to 32 times less than that of regular parameters (see the optimizer sketch below) |
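
The Pseudocode row above cites Algorithm 1, the PROJUNN update step. As a rough illustration of the idea, and not the authors' implementation, the sketch below takes a gradient step using a rank-k approximation of the gradient and then projects the result back onto the unitary manifold via its polar factor. The names `rank_k_approx` and `projunn_step` are hypothetical, and the full SVDs here forgo the low-rank O(kn^2) tricks that give the paper's method its efficiency.

```python
import torch

def rank_k_approx(grad: torch.Tensor, k: int) -> torch.Tensor:
    # Truncated SVD: keep only the top-k singular triplets of the gradient.
    U, S, Vh = torch.linalg.svd(grad, full_matrices=False)
    return (U[:, :k] * S[:k].to(U.dtype)) @ Vh[:k, :]

def projunn_step(W: torch.Tensor, grad: torch.Tensor, lr: float, k: int = 1) -> torch.Tensor:
    # Gradient step using the rank-k gradient approximation.
    A = W - lr * rank_k_approx(grad, k)
    # Project back onto the unitary manifold: the closest unitary matrix to A
    # (in Frobenius norm) is its polar factor U @ Vh, where A = U S Vh.
    U, _, Vh = torch.linalg.svd(A, full_matrices=False)
    return U @ Vh

# Example: one update on a random 170x170 orthogonal matrix (real case).
W, _ = torch.linalg.qr(torch.randn(170, 170))
W = projunn_step(W, torch.randn(170, 170), lr=1e-3, k=1)
```
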
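For the Dataset Splits row, here is a minimal sketch of holding out 10% of the training set as a validation set; using MNIST, `random_split`, and a fixed seed to make the split identical across models are assumptions for illustration, not details confirmed by the paper.

```python
import torch
from torch.utils.data import random_split
from torchvision import datasets, transforms

train_full = datasets.MNIST("data/", train=True, download=True,
                            transform=transforms.ToTensor())
n_val = len(train_full) // 10                    # 10% held out for validation
train_set, val_set = random_split(
    train_full, [len(train_full) - n_val, n_val],
    generator=torch.Generator().manual_seed(0),  # fixed seed: same split for all models
)
```
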
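For the Experiment Setup row, the sketch below shows one way to realize the reported optimizer configuration: RMSprop with the unitary (recurrent) parameters trained at 1/32 of the base learning rate, via two parameter groups. The `TinyUnitaryRNN` class, its parameter names, and the base learning rate value are illustrative assumptions; the paper reports hidden dimension 170, batch size 128, and 200 epochs, which would apply to the surrounding training loop.

```python
import torch
import torch.nn as nn

class TinyUnitaryRNN(nn.Module):
    """Hypothetical stand-in: a recurrent cell whose hidden-to-hidden kernel
    is the unitary parameter that projUNN keeps on the manifold."""
    def __init__(self, input_dim: int = 1, hidden_dim: int = 170):
        super().__init__()
        self.recurrent = nn.Parameter(torch.eye(hidden_dim))  # unitary kernel
        self.input_map = nn.Linear(input_dim, hidden_dim)
        self.readout = nn.Linear(hidden_dim, 1)

model = TinyUnitaryRNN()
base_lr = 1e-3  # illustrative value only; not pinned down in the quoted text
optimizer = torch.optim.RMSprop([
    {"params": [p for n, p in model.named_parameters() if n != "recurrent"],
     "lr": base_lr},
    # "the learning rate for unitary parameters was set to 32 times less
    # than that of regular parameters"
    {"params": [model.recurrent], "lr": base_lr / 32},
])
```

Separating the parameters into two groups is a standard PyTorch pattern for per-parameter learning rates; the unitary group would be the one updated with the projection step sketched earlier.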