Efficiently Learning One Hidden Layer ReLU Networks From Queries
Authors: Sitan Chen, Adam Klivans, Raghu Meka
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | This work gives the first polynomial-time algorithm for learning one hidden layer neural networks given black-box access to the network. Formally, we show that if F is an arbitrary one hidden layer neural network with ReLU activations, there is an algorithm with query complexity and running time polynomial in all parameters that outputs a network F′ achieving low square loss relative to F with respect to the Gaussian measure. While a number of works in the security literature have proposed and empirically demonstrated the effectiveness of certain algorithms for this problem, ours is the first with fully polynomial-time guarantees of efficiency for worst-case networks (in particular, our algorithm succeeds in the overparameterized setting). |
| Researcher Affiliation | Academia | Sitan Chen (sitanc@mit.edu, MIT); Adam R. Klivans (klivans@cs.utexas.edu, UT Austin); Raghu Meka (raghum@cs.ucla.edu, UCLA) |
| Pseudocode | Yes | Algorithm 1: GETBIAS(L, I) ... Algorithm 2: GETGRADIENT(x, α) ... Algorithm 3: GETNEURONS(ϵ, δ) ... Algorithm 4: LEARNFROMQUERIES(ϵ, δ) |
| Open Source Code | No | The paper does not provide any statement or link indicating that source code for the methodology described is publicly available. |
| Open Datasets | No | The paper is theoretical and focuses on algorithmic guarantees for learning neural networks from queries, rather than conducting empirical studies on specific datasets. It does not mention or provide access information for any publicly available or open dataset used for training or evaluation. |
| Dataset Splits | No | The paper is theoretical and does not describe empirical experiments, so it provides no training, validation, or test dataset splits. |
| Hardware Specification | No | The paper is theoretical and does not describe the execution of experiments or require specific hardware. Therefore, no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and does not describe an implementation that would require specific software dependencies with version numbers. No such details are provided. |
| Experiment Setup | No | The paper is theoretical and focuses on algorithmic design and guarantees, rather than empirical experimentation. Therefore, it does not include details about an experimental setup, such as hyperparameters or training configurations. |
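The paper's query-access model and its GETGRADIENT subroutine can be illustrated with a small sketch. Everything below is a hypothetical toy example, not the paper's actual algorithm: the network `F`, the parameter shapes, and the finite-difference estimator `query_gradient` are illustrative assumptions. The key point it demonstrates is that at a generic input, the gradient of a one hidden layer ReLU network is a linear combination of the hidden weight vectors of the active neurons, so gradient information is recoverable from black-box function queries alone.

```python
import numpy as np

# Hypothetical illustration (not the paper's actual algorithm): a one hidden
# layer ReLU network F(x) = sum_i a_i * relu(<w_i, x> + b_i), accessed only
# through black-box queries, plus a finite-difference gradient estimate in the
# spirit of the paper's GETGRADIENT(x, alpha) subroutine.

rng = np.random.default_rng(0)
d, k = 5, 3                      # input dimension, number of hidden neurons
W = rng.standard_normal((k, d))  # hidden-layer weight vectors w_i
b = rng.standard_normal(k)       # biases b_i
a = rng.standard_normal(k)       # output-layer weights a_i

def F(x):
    """Black-box query access: returns only the value F(x), no parameters."""
    return a @ np.maximum(W @ x + b, 0.0)

def query_gradient(x, alpha=1e-5):
    """Estimate grad F(x) using 2d queries via central finite differences."""
    g = np.zeros(d)
    for j in range(d):
        e = np.zeros(d)
        e[j] = alpha
        g[j] = (F(x + e) - F(x - e)) / (2 * alpha)
    return g

# At a generic point x (no neuron exactly on its boundary), F is locally
# linear, so grad F(x) = sum over active neurons i of a_i * w_i.
x = rng.standard_normal(d)
active = (W @ x + b) > 0
exact_grad = (a[active, None] * W[active]).sum(axis=0)
est_grad = query_gradient(x)
assert np.allclose(est_grad, exact_grad, atol=1e-4)
```

Because ReLU networks are piecewise linear, the central difference is essentially exact away from neuron boundaries; the paper's full algorithm goes much further, combining such gradient queries with bias-recovery steps (GETBIAS, GETNEURONS) to reconstruct individual neurons with worst-case guarantees.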