Data-Dependent Path Normalization in Neural Networks

Authors: Behnam Neyshabur, Ryota Tomioka, Ruslan Salakhutdinov, Nathan Srebro

ICLR 2016

| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Theoretical | The paper proposes a unified theoretical framework for neural-network normalization, regularization, and optimization, and analyzes its properties and its relationships to existing methods such as Path-SGD and Batch-Normalization. It includes mathematical derivations and proofs (Theorems 1–4 and the calculations in Appendices A and B). Although Appendix A is titled "IMPLEMENTATION" and describes how the calculations can be carried out efficiently, the paper presents no empirical studies, experimental results, dataset evaluations, performance metrics, or comparisons based on actual runs, which are the hallmarks of experimental research. |
| Researcher Affiliation | Collaboration | Behnam Neyshabur, Toyota Technological Institute at Chicago, Chicago, IL 60637, USA (bneyshabur@ttic.edu); Ryota Tomioka, Microsoft Research, Cambridge, UK (ryoto@microsoft.com); Ruslan Salakhutdinov, Department of Computer Science, University of Toronto, Canada (rsalakhu@cs.toronto.edu); Nathan Srebro, Toyota Technological Institute at Chicago, Chicago, IL 60637, USA (nati@ttic.edu). The affiliations span academia (Toyota Technological Institute at Chicago, University of Toronto) and industry (Microsoft Research), indicating a collaboration. |
| Pseudocode | No | Appendix A, "IMPLEMENTATION", describes computational steps and derivations for DDP-Normalization and DDP-SGD, but these are presented as mathematical equations and textual descriptions, not as structured pseudocode blocks or algorithms with clearly defined inputs, outputs, and numbered steps. |
| Open Source Code | No | The paper contains no explicit statement that source code is publicly available and provides no link to a code repository. |
| Open Datasets | No | The paper is theoretical and reports no empirical experiments involving datasets, so it does not mention specific datasets or their public availability for training. |
| Dataset Splits | No | The paper is theoretical and conducts no experiments; consequently, it provides no details about training, validation, or test splits. |
| Hardware Specification | No | The paper focuses on theoretical contributions and describes no experimental setup or hardware used to perform computations. |
| Software Dependencies | No | The paper is theoretical and gives no specific software dependencies or version numbers needed to replicate an implementation. |
| Experiment Setup | No | The paper is theoretical and describes no specific experimental setup, hyperparameters, or system-level training settings. |
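For context on the kind of calculation Appendix A makes efficient: path-based scalings of the sort DDP-Normalization builds on (following Path-SGD) can be computed for a feedforward network with a single forward pass through the squared weights, seeded at the inputs by second moments of the data. The sketch below is illustrative only, not the paper's implementation; the function name and the plain-MLP setting are assumptions.

```python
import numpy as np

def path_scales(weights, x_sq_moments):
    """Illustrative: per-unit squared path scales via one forward pass.

    weights: list of (fan_in, fan_out) weight matrices of a plain MLP.
    x_sq_moments: E[x_i^2] for each input coordinate (the data-dependent
        base case; using all-ones instead recovers a data-independent
        path scaling).

    Each unit's scale is the sum, over all input-to-unit paths, of the
    product of squared weights along the path, weighted by the input's
    second moment. Propagating through W**2 computes this in one pass
    instead of enumerating paths.
    """
    gamma_sq = np.asarray(x_sq_moments, dtype=float)
    scales = [gamma_sq]
    for W in weights:
        gamma_sq = gamma_sq @ (np.asarray(W) ** 2)  # squared-weight pass
        scales.append(gamma_sq)
    return scales
```

For a single 1x1 "network" with weight 2 and unit input moment, the only path contributes 2² = 4, which the forward pass reproduces.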