Real-Valued Backpropagation is Unsuitable for Complex-Valued Neural Networks

Authors: Zhi-Hao Tan, Yi Xie, Yuan Jiang, Zhi-Hua Zhou

NeurIPS 2022 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Finally, the experiments validate our theoretical findings numerically."
Researcher Affiliation | Academia | Zhi-Hao Tan, Yi Xie, Yuan Jiang, Zhi-Hua Zhou; National Key Laboratory for Novel Software Technology, Nanjing University, Nanjing 210023, China; {tanzh, xiey, jiangy, zhouzh}@lamda.nju.edu.cn
Pseudocode | No | The paper provides a conceptual "Definition 1 (Complex Tensor Program)" describing how complex tensor programs are recursively generated, but it does not present a structured pseudocode block or algorithm.
Open Source Code | No | "Did you include the code, data, and instructions needed to reproduce the main experimental results (either in the supplemental material or as a URL)? [No]"
Open Datasets | Yes | The third experiment investigates the convergence of the difference between the empirical complex NTKs Θ̂_t^(n) and the real NTKs Θ_r during training as the widths go to infinity, on MNIST [LeCun et al., 1998].
Dataset Splits | No | The paper mentions using a "training set D = (X, Y) (|D| = 128)" from MNIST but does not specify how this dataset was split into training, validation, and test subsets, nor does it mention any cross-validation details.
Hardware Specification | No | "Did you include the total amount of compute and the type of resources used (e.g., type of GPUs, internal cluster, or cloud provider)? [N/A] The numerical experiments only aim to verify the theoretical results."
Software Dependencies | No | All empirical NTKs of complex networks are calculated based on the Neural Tangents library [Novak et al., 2019]. No specific version is given for this or any other software component. (See the first sketch below.)
Experiment Setup | Yes | In NTK initialization, the standard deviations are set as 1 for complex networks and scaled to sqrt(2) for real networks. ... The learning rate η is 0.5 for l = 1 and 0.2 for l = 2. (See the second sketch below.)
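For the Software Dependencies row: the paper states that empirical NTKs are computed with the Neural Tangents library, but no usage details are given. The sketch below shows one plausible way to compute an empirical (finite-width) NTK with that library; the two-layer ReLU architecture, width 512, and random stand-in data are illustrative assumptions, not the authors' code, and the paper's complex-valued networks would additionally require a custom apply_fn.

```python
# Minimal sketch: empirical (finite-width) NTK via Neural Tangents.
# Assumptions: a real-valued 2-layer ReLU network and random stand-in data;
# the paper's complex-valued networks are not reproduced here.
import jax
import neural_tangents as nt
from neural_tangents import stax

# Fully-connected network in the NTK parameterization; W_std plays the role of
# the per-layer standard deviation mentioned in the Experiment Setup row.
init_fn, apply_fn, _ = stax.serial(
    stax.Dense(512, W_std=2**0.5, b_std=0.0),
    stax.Relu(),
    stax.Dense(1, W_std=2**0.5, b_std=0.0),
)

key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (128, 784))   # stand-in for |D| = 128 flattened MNIST images
_, params = init_fn(key, x.shape)

# Empirical NTK evaluated at the current parameters.
ntk_fn = nt.empirical_ntk_fn(apply_fn)
theta_hat = ntk_fn(x, None, params)      # (128, 128) Gram matrix for scalar outputs
print(theta_hat.shape)
```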
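For the Experiment Setup row: the quoted setup gives only the standard deviations (1 for complex networks, sqrt(2) for real networks). The sketch below illustrates one way such a scaling could be realized; splitting the complex variance evenly between the real and imaginary parts is an assumption on our side, since the paper does not spell out the sampling procedure.

```python
# Sketch of the initialization scaling from the Experiment Setup row.
# Assumption: a complex weight with std 1 puts half its variance in the real
# part and half in the imaginary part; the paper does not specify this.
import numpy as np

rng = np.random.default_rng(0)
shape = (784, 512)  # illustrative layer shape

# Complex weights with overall standard deviation 1.
w_complex = (rng.normal(scale=np.sqrt(0.5), size=shape)
             + 1j * rng.normal(scale=np.sqrt(0.5), size=shape))

# Real weights scaled to standard deviation sqrt(2).
w_real = rng.normal(scale=np.sqrt(2.0), size=shape)

print(np.std(w_complex), np.std(w_real))  # approximately 1.0 and sqrt(2)
```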