On the Implicit Bias of Linear Equivariant Steerable Networks
Authors: Ziyu Chen, Wei Zhu
NeurIPS 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | We study the implicit bias of gradient flow on linear equivariant steerable networks in group-invariant binary classification. Our findings reveal that the parameterized predictor converges in direction to the unique group-invariant classifier with a maximum margin defined by the input group action. Under a unitary assumption on the input representation, we establish the equivalence between steerable networks and data augmentation. Furthermore, we demonstrate the improved margin and generalization bound of steerable networks over their non-invariant counterparts. |
| Researcher Affiliation | Academia | Ziyu Chen, Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, MA 01003, ziyuchen@umass.edu; Wei Zhu, Department of Mathematics and Statistics, University of Massachusetts Amherst, Amherst, MA 01003, weizhu@umass.edu |
| Pseudocode | No | The paper describes algorithms and mathematical formulations in text and equations, but does not provide structured pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not contain any statement about making its source code available, nor does it provide links to a code repository. |
| Open Datasets | No | The paper describes theoretical analysis and does not use or reference specific public datasets for experimental training. |
| Dataset Splits | No | The paper is theoretical and does not describe experimental dataset splits for training, validation, or testing. |
| Hardware Specification | No | The paper focuses on theoretical analysis and does not mention any specific hardware used for experiments. |
| Software Dependencies | No | The paper is theoretical and does not mention any specific software dependencies or versions. |
| Experiment Setup | No | The paper is theoretical and does not describe an experimental setup with hyperparameters or system-level training settings. |
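
The Research Type entry refers to group-invariant linear classifiers under a unitary input representation. As a rough illustration only, and not code from the paper, the sketch below checks numerically that averaging a linear predictor over a unitary group representation yields a predictor whose score is unchanged by the group action. The choice of the cyclic group C_4 acting on R^4 by shifts, and the helper names `cyclic_shift_rep` and `invariant_projection`, are assumptions made for this example.

```python
import numpy as np

def cyclic_shift_rep(n):
    """Permutation matrices of the cyclic group C_n acting on R^n by shifts.

    Assumption for illustration: this is one simple unitary representation,
    not a construction taken from the paper.
    """
    eye = np.eye(n)
    return [np.roll(eye, k, axis=0) for k in range(n)]

def invariant_projection(rep):
    """Average the transposed representation matrices over the group."""
    return sum(g.T for g in rep) / len(rep)

rng = np.random.default_rng(0)
rep = cyclic_shift_rep(4)

w = rng.normal(size=4)  # arbitrary linear predictor
x = rng.normal(size=4)  # arbitrary input

# Project w onto the subspace of group-invariant linear predictors.
w_inv = invariant_projection(rep) @ w

# Score of the projected predictor on every transformed copy of x.
scores = [float(w_inv @ (g @ x)) for g in rep]
is_invariant = bool(np.allclose(scores, scores[0]))
print("invariant under the group action:", is_invariant)
```

For this particular representation the projected predictor is a constant vector whose entries equal the mean of `w`, so its score depends only on the sum of the input coordinates, which a shift leaves unchanged.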