New Complexity-Theoretic Frontiers of Tractability for Neural Network Training

Authors: Cornelius Brand, Robert Ganian, Mathis Rocton

NeurIPS 2023

Reproducibility Variable | Result | LLM Response
Research Type | Theoretical | In this article we obtain novel algorithmic upper bounds for training linear- and ReLU-activated neural networks to optimality, which push the boundaries of tractability for these problems beyond the previous state of the art. In particular, for ReLU networks we establish the polynomial-time tractability of all architectures where hidden neurons have an out-degree of 1, improving upon the previous algorithm of Arora, Basu, Mianjy and Mukherjee. On the other hand, for networks with linear activation functions we identify the first non-trivial polynomial-time solvable class of networks by obtaining an algorithm that can optimally train network architectures satisfying a novel data throughput condition. ... We have examined the theoretical boundaries of computational tractability for training neural networks both for the fundamental cases of ReLU and linear activations, complementing a flurry of recent results establishing increasingly stronger lower bounds for network training. (The out-degree-1 condition on hidden neurons is illustrated in the sketch after this table.)
Researcher Affiliation | Academia | Cornelius Brand, Algorithms & Complexity Group, Vienna University of Technology, Favoritenstraße 9-11, 1040 Vienna, Austria, cbrand@ac.tuwien.ac.at; Robert Ganian, Algorithms & Complexity Group, Vienna University of Technology, Favoritenstraße 9-11, 1040 Vienna, Austria, rganian@gmail.com; Mathis Rocton, Algorithms & Complexity Group, Vienna University of Technology, Favoritenstraße 9-11, 1040 Vienna, Austria, mrocton@ac.tuwien.ac.at
Pseudocode | No | The paper describes algorithms and proof sketches verbally and through theorems and lemmas (e.g., "The algorithm begins by exhaustively branching..."), but it does not include any formal pseudocode blocks or clearly labeled algorithm sections.
Open Source Code | No | The paper does not contain any statement about releasing source code for the methodology described, nor does it provide a link to a code repository.
Open Datasets | No | The paper is theoretical and focuses on computational complexity. It discusses neural network training in a formal context (e.g., "given a training data set containing m samples") but does not describe using or providing access to any specific dataset for empirical evaluation.
Dataset Splits | No | The paper is theoretical and does not present empirical experiments. Therefore, there are no mentions of specific training, validation, or test dataset splits.
Hardware Specification | No | The paper is theoretical and does not report on empirical experiments that would require specific hardware. Therefore, no hardware specifications are mentioned.
Software Dependencies | No | The paper is theoretical and does not report on empirical experiments or implementations that would require specific software dependencies with version numbers.
Experiment Setup | No | The paper is theoretical and does not present empirical experiments. Therefore, it does not include details about an experimental setup, hyperparameters, or training configurations.
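
The architectural condition named in the abstract (every hidden neuron has out-degree 1) can be phrased as a simple structural check on the network graph. The sketch below is illustrative only and is not taken from the paper: the adjacency-list representation, the function name hidden_neurons_have_out_degree_one, and the example network are assumptions made here for illustration, and the paper's polynomial-time training algorithm itself is not reproduced.

# Illustrative sketch only: checks the structural condition quoted in the
# abstract (every hidden neuron has out-degree 1); it does NOT implement
# the paper's training algorithm.
def hidden_neurons_have_out_degree_one(edges, hidden):
    """Return True if every hidden neuron feeds into exactly one other neuron.

    edges  -- dict mapping each neuron to the list of neurons it feeds into
    hidden -- set of neurons that are neither input nor output neurons
    """
    return all(len(edges.get(v, [])) == 1 for v in hidden)

# Example: two inputs, two hidden ReLU neurons, one output. Every hidden
# neuron has out-degree exactly 1, so this architecture lies in the class
# the abstract describes as polynomial-time trainable.
edges = {
    "x1": ["h1"],
    "x2": ["h1", "h2"],   # input neurons may have arbitrary out-degree
    "h1": ["out"],
    "h2": ["out"],        # hidden neurons: out-degree exactly 1
}
print(hidden_neurons_have_out_degree_one(edges, hidden={"h1", "h2"}))  # prints True

Under this check, the example passes; an architecture in which h1 additionally fed into h2 would have a hidden neuron of out-degree 2 and would fall outside the class covered by the stated tractability result.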