OPTIMAL ROBUST MEMORIZATION WITH RELU NEURAL NETWORKS
Authors: Lijia Yu, Xiao-Shan Gao, Lijun Zhang
ICLR 2024 | Conference PDF | Archive PDF | Plain Text | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | First, we prove that it is NP-hard to compute neural networks with certain simple structures that are robust memorizations. A network hypothesis space is called an optimal robust memorization for a dataset if it can achieve robust memorization for any budget less than half the separation bound of the dataset. Second, we explicitly construct neural networks with O(Nn) parameters for optimal robust memorization of any dataset with dimension n and size N. We also give a lower bound on the width of networks needed to achieve optimal robust memorization. Finally, we explicitly construct neural networks with O(Nn log n) parameters for optimal robust memorization of any binary classification dataset by controlling the Lipschitz constant of the network. (A toy numerical illustration of the separation bound is sketched after this table.) |
| Researcher Affiliation | Academia | Lijia Yu (Institute of Software, Chinese Academy of Sciences; Academy of Mathematics and Systems Science, Chinese Academy of Sciences), yulijia@ios.ac.cn; Xiao-Shan Gao (Academy of Mathematics and Systems Science, Chinese Academy of Sciences; University of Chinese Academy of Sciences), xgao@mmrc.iss.ac.cn; Lijun Zhang (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences), zhanglj@ios.ac.cn |
| Pseudocode | No | While the paper describes the construction of networks, it does so using mathematical formulas and descriptions (e.g., “F will be defined in three steps for an input x ∈ R^n”) rather than formal pseudocode blocks. |
| Open Source Code | No | The paper is purely theoretical and focuses on mathematical proofs and constructions. There is no mention of an implementation or associated source code. |
| Open Datasets | No | The paper is theoretical and does not perform experiments on real-world datasets. Datasets are defined abstractly as D = {(x_i, y_i)}_{i=1}^N ⊂ R^n × [L] for the purpose of theoretical analysis and proofs. |
| Dataset Splits | No | As the paper is theoretical, it does not involve data splits for training, validation, or testing. |
| Hardware Specification | No | No hardware specifications are provided as no experiments are conducted. |
| Software Dependencies | No | No software dependencies are provided as no implementation or experiments are conducted. |
| Experiment Setup | No | As a theoretical paper, it does not include details on experimental setup. |
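
To make the key quantity in the summary concrete, below is a minimal sketch (the paper itself provides no code) that computes the separation bound of a toy dataset and the corresponding admissible robustness budgets. The Euclidean norm, the dataset values, and the helper name `separation_bound` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def separation_bound(X, y):
    """Smallest distance between any two points that carry different labels.

    The norm choice (Euclidean here) is an assumption for illustration; the
    paper states its results in terms of a general separation bound.
    """
    lam = np.inf
    for i in range(len(X)):
        for j in range(i + 1, len(X)):
            if y[i] != y[j]:
                lam = min(lam, float(np.linalg.norm(X[i] - X[j])))
    return lam

# Toy binary dataset with N = 4 points in R^2 (hypothetical values).
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 3.0], [1.0, 3.0]])
y = np.array([0, 0, 1, 1])

lam = separation_bound(X, y)   # closest cross-label pair here is at distance 3.0
mu_max = lam / 2               # robust memorization is required for any budget mu < lam / 2
print(f"separation bound = {lam:.2f}, admissible budgets: mu < {mu_max:.2f}")
# For mu < lam / 2, balls of radius mu around differently labeled points are
# disjoint, so a classifier that is constant on each ball can fit every point
# robustly; "optimal" refers to achieving this for every such budget mu.
```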