Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
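As a rough, hypothetical sketch of what validating automated labels against a manually labeled gold set can look like (the variable names, labels, and per-variable accuracy metric below are illustrative assumptions, not the actual pipeline, whose methodology and accuracy figures are described in [1]):

```python
# Hypothetical sketch: per-variable agreement between automated LLM labels
# and manual gold labels. Data and metric are illustrative only; the real
# validation methodology and accuracy metrics are described in [1].
from collections import defaultdict

gold = [       # (reproducibility variable, manual label)
    ("Open Source Code", "No"),
    ("Open Datasets",    "No"),
    ("Pseudocode",       "Yes"),
]
llm = [        # (reproducibility variable, LLM label), aligned with `gold`
    ("Open Source Code", "No"),
    ("Open Datasets",    "Yes"),
    ("Pseudocode",       "Yes"),
]

correct, total = defaultdict(int), defaultdict(int)
for (var, g), (_, p) in zip(gold, llm):
    total[var] += 1
    correct[var] += int(g == p)

for var in total:
    print(f"{var}: accuracy {correct[var] / total[var]:.2f} over {total[var]} paper(s)")
```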
Deep Network Approximation: Beyond ReLU to Diverse Activation Functions
Authors: Shijun Zhang, Jianfeng Lu, Hongkai Zhao
JMLR 2024 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | While these recently proposed activation functions have demonstrated promising empirical results, their theoretical underpinnings are still being developed. This paper aims to investigate the expressive capabilities of deep neural networks utilizing these activation functions. In doing so, we establish connections between these functions and ReLU, allowing us to extend most existing approximation results for ReLU networks to encompass other activation functions such as ELU and GELU. |
| Researcher Affiliation | Academia | Shijun Zhang, Department of Mathematics, Duke University; Jianfeng Lu, Department of Mathematics, Duke University; Hongkai Zhao, Department of Mathematics, Duke University |
| Pseudocode | No | The paper describes mathematical proofs and theoretical derivations. There are no explicit sections or figures labeled "Pseudocode" or "Algorithm", nor are there any structured, code-like blocks detailing a procedure. |
| Open Source Code | No | The paper does not contain any explicit statements about releasing code for the described methodology, nor does it provide links to any code repositories in the main text or supplementary sections. |
| Open Datasets | No | The paper is theoretical, focusing on the expressive power of neural networks and mathematical proofs. It does not involve experimental evaluation on specific datasets, so no datasets are mentioned or made available. |
| Dataset Splits | No | Since no datasets are used for experimental evaluation in this theoretical paper, no dataset splits (training, validation, test) are provided. |
| Hardware Specification | No | This paper presents theoretical results and mathematical proofs. No experiments are conducted, so no hardware specifications for running experiments are mentioned. |
| Software Dependencies | No | Because this is a purely theoretical paper, there is no experimental implementation that would require listing specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is focused on theoretical analysis and mathematical proofs, not experimental results. Therefore, there are no details regarding experimental setup, hyperparameters, or system-level training settings. |
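The Research Type response above notes that the paper relates activations such as GELU back to ReLU. As a minimal numerical illustration of one standard such connection (a textbook limit, not the paper's actual construction): since GELU(x) = x·Φ(x) with Φ the standard normal CDF, the rescaling GELU(s·x)/s = x·Φ(s·x) converges pointwise to ReLU(x) = max(x, 0) as s → ∞.

```python
# Minimal sketch: rescaled GELU approaches ReLU as the scale s grows.
# This illustrates the kind of ReLU connection the quoted abstract refers
# to; it is not the construction used in the paper itself.
import numpy as np
from math import erf, sqrt

def gelu(x):
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return np.array([v * 0.5 * (1.0 + erf(v / sqrt(2.0))) for v in x])

def relu(x):
    return np.maximum(x, 0.0)

x = np.linspace(-3.0, 3.0, 601)
for s in (1.0, 10.0, 100.0):
    # Maximum pointwise gap between GELU(s*x)/s and ReLU(x) on [-3, 3];
    # it shrinks as s increases.
    err = np.max(np.abs(gelu(s * x) / s - relu(x)))
    print(f"s = {s:6.1f}   max deviation from ReLU = {err:.4f}")
```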