Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Classification with Deep Neural Networks and Logistic Loss
Authors: Zihan Zhang, Lei Shi, Ding-Xuan Zhou
JMLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | In this paper, we aim to fill this gap by developing a novel theoretical analysis and using it to establish tight generalization bounds for training fully connected ReLU DNNs with logistic loss in binary classification. Our generalization analysis is based on an elegant oracle-type inequality... we establish generalization bounds for fully connected ReLU classifiers... we obtain optimal convergence rates... Moreover, we consider a compositional assumption... Furthermore, we establish dimension-free rates of convergence... Besides the novel oracle-type inequality, the sharp convergence rates presented in our paper also owe to a tight error bound for approximating the natural logarithm function... In addition, we justify our claims for the optimality of rates by proving corresponding minimax lower bounds. |
| Researcher Affiliation | Academia | Zihan Zhang (Shanghai Center for Mathematical Sciences, Fudan University, Shanghai 200433, China; School of Data Science, City University of Hong Kong, Kowloon, Hong Kong); Lei Shi (School of Mathematical Sciences and Shanghai Key Laboratory for Contemporary Applied Mathematics, Fudan University, Shanghai 200433, China; Shanghai Artificial Intelligence Laboratory, 701 Yunjin Road, Shanghai 200232, China); Ding-Xuan Zhou (School of Mathematics and Statistics, University of Sydney, Sydney NSW 2006, Australia) |
| Pseudocode | No | The paper includes figures illustrating functions or networks (e.g., "Figure 2.1: An illustration of the function", "Figure C.1: Networks representing functions hk."), but these are visual representations rather than structured pseudocode or algorithm blocks describing a procedure. |
| Open Source Code | No | The paper includes a license statement: "License: CC-BY 4.0, see https://creativecommons.org/licenses/by/4.0/. Attribution requirements are provided at http://jmlr.org/papers/v25/22-0049.html." This refers to the paper's licensing and attribution, not the release of source code for the methodology described in the paper. No other statement regarding code availability is found. |
| Open Datasets | No | The paper is theoretical and focuses on generalization analysis and convergence rates for deep neural networks. It discusses binary classification problems and refers to properties of data distributions (e.g., the conditional probability function). It mentions the "CIFAR10 data set" only in a discussion comparing with other work; it does not use any dataset for experiments or provide access information for any dataset used in its own research. |
| Dataset Splits | No | The paper is theoretical and does not conduct empirical studies or experiments. Therefore, there are no dataset splits provided. |
| Hardware Specification | No | The paper is theoretical and focuses on mathematical analysis and proofs for deep neural networks. It does not describe any experiments that would require specific hardware. No hardware specifications are provided. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical analysis. It mentions software like "PyTorch" and "Caffe" in the context of discussing other related works or general deep learning practices, but it does not specify any software dependencies with version numbers for its own work or experiments. |
| Experiment Setup | No | The paper is theoretical and focuses on developing a novel generalization analysis and establishing mathematical bounds for deep neural networks. It does not describe any empirical experiments, and therefore, no experimental setup details, hyperparameters, or training configurations are provided. |
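Since the paper itself contains no experiments, the following is a minimal illustrative sketch (not taken from the paper) of the setting it analyzes: empirical risk minimization under the logistic loss phi(t) = log(1 + exp(-t)) for a fully connected one-hidden-layer ReLU network on a toy binary classification problem with labels in {-1, +1}. All data, network sizes, and hyperparameters here are arbitrary choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy binary classification data with labels in {-1, +1} (illustrative only).
n, d, width = 200, 2, 16
X = rng.normal(size=(n, d))
y = np.sign(X[:, 0] + 0.5 * X[:, 1])
y[y == 0] = 1.0

# Fully connected ReLU network f(x) = a^T relu(W x + b).
W = rng.normal(size=(width, d)) * 0.5
b = np.zeros(width)
a = rng.normal(size=width) * 0.5

def forward(X):
    H = np.maximum(X @ W.T + b, 0.0)  # ReLU hidden layer
    return H, H @ a

def logistic_risk(f, y):
    # Numerically stable mean of log(1 + exp(-y * f)).
    return np.mean(np.logaddexp(0.0, -y * f))

lr = 0.1
initial_risk = logistic_risk(forward(X)[1], y)
for _ in range(200):
    H, f = forward(X)
    # Gradient of the empirical logistic risk w.r.t. the network output f.
    g = -y / (1.0 + np.exp(y * f)) / n
    grad_a = H.T @ g
    grad_H = np.outer(g, a) * (H > 0)  # backprop through ReLU
    W -= lr * (grad_H.T @ X)
    b -= lr * grad_H.sum(axis=0)
    a -= lr * grad_a
final_risk = logistic_risk(forward(X)[1], y)
```

Gradient descent on this empirical risk drives `final_risk` below `initial_risk`; the paper's contribution is bounding the gap between such empirical risks and the population risk.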