OPTIMAL ROBUST MEMORIZATION WITH RELU NEURAL NETWORKS

Authors: Lijia Yu, Xiao-Shan Gao, Lijun Zhang

ICLR 2024

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | First, the paper proves that it is NP-hard to compute neural networks with certain simple structures that achieve robust memorization. A network hypothesis space is called an optimal robust memorization for a dataset if it can achieve robust memorization for any budget less than half the separation bound of the dataset. Second, the authors explicitly construct neural networks with O(Nn) parameters for optimal robust memorization of any dataset with dimension n and size N. They also give a lower bound on the width of networks needed to achieve optimal robust memorization. Finally, they explicitly construct neural networks with O(Nn log n) parameters for optimal robust memorization of any binary classification dataset by controlling the Lipschitz constant of the network.
Researcher Affiliation | Academia | Lijia Yu (Institute of Software, Chinese Academy of Sciences; Academy of Mathematics and Systems Science, Chinese Academy of Sciences) yulijia@ios.ac.cn; Xiao-Shan Gao (Academy of Mathematics and Systems Science, Chinese Academy of Sciences; University of Chinese Academy of Sciences) xgao@mmrc.iss.ac.cn; Lijun Zhang (Institute of Software, Chinese Academy of Sciences; University of Chinese Academy of Sciences) zhanglj@ios.ac.cn
Pseudocode | No | While the paper describes the construction of networks, it does so using mathematical formulas and descriptions (e.g., "F will be defined in three steps for an input x ∈ R^n") rather than formal pseudocode blocks.
Open Source Code | No | The paper is purely theoretical and focuses on mathematical proofs and constructions. There is no mention of an implementation or associated source code.
Open Datasets | No | The paper is theoretical and does not perform experiments on real-world datasets. Datasets are defined abstractly as D = {(x_i, y_i)}_{i=1}^N ⊂ R^n × [L] for the purpose of theoretical analysis and proofs.
Dataset Splits | No | As the paper is theoretical, it does not involve data splits for training, validation, or testing.
Hardware Specification | No | No hardware specifications are provided, as no experiments are conducted.
Software Dependencies | No | No software dependencies are provided, as no implementation or experiments are conducted.
Experiment Setup | No | As a theoretical paper, it does not include details on an experimental setup.
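The assessment above notes that the paper's robustness budget must be less than half the separation bound of the dataset. As a minimal illustrative sketch (not from the paper; the helper name `separation_bound` and the toy data are assumptions), the separation bound can be computed as the smallest Euclidean distance between any two samples with different labels:

```python
import numpy as np

def separation_bound(X, y):
    """Smallest distance between samples carrying different labels.

    Illustrative only: for robust memorization to be well defined,
    the robustness budget must be below half this value, since balls
    of that radius around differently labeled points never overlap.
    """
    diff_label = y[:, None] != y[None, :]                 # pairs with different labels
    dists = np.linalg.norm(X[:, None] - X[None], axis=-1)  # pairwise distances
    return dists[diff_label].min()

# Toy binary-classification dataset in R^2 (hypothetical example data)
X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0]])
y = np.array([0, 0, 1, 1])

lam = separation_bound(X, y)  # separation bound of the dataset
print(lam / 2)                # any budget below this is achievable → 0.5
```

Here the closest differently labeled pair is (0, 0) and (0, 1) at distance 1, so any robustness budget below 0.5 keeps the perturbation balls of the two classes disjoint.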