Representation Costs of Linear Neural Networks: Analysis and Design

Authors: Zhen Dai, Mina Karzand, Nathan Srebro

NeurIPS 2021

| Reproducibility Variable | Result | LLM Response |
| --- | --- | --- |
| Research Type | Theoretical | In this paper, we study how different parameterizations induce different complexity measures. ... The limitation of our work includes the fact that we are considering only a rather specific set of architectures, and in particular only linear models. So our study is mostly meant to build tools and understanding and set the stage for understanding more complex non-linear models. |
| Researcher Affiliation | Academia | Zhen Dai, Committee on Computational and Applied Mathematics, University of Chicago, Chicago, IL 60637, zhen9@uchicago.edu; Mina Karzand, Department of Statistics, University of California, Davis, Davis, CA 95616, mkarzand@ucdavis.edu; Nathan Srebro, Toyota Technological Institute at Chicago, Chicago, IL 60637, nati@ttic.edu |
| Pseudocode | No | The paper does not contain any pseudocode or clearly labeled algorithm blocks. It presents mathematical definitions, theorems, and proofs. |
| Open Source Code | No | The paper is theoretical and does not describe a method that involves releasing source code. There is no mention of open-source code availability for the described methodology or links to code repositories. |
| Open Datasets | No | The paper is theoretical and does not involve experimental evaluation using datasets. Thus, there is no information about public dataset availability for training. |
| Dataset Splits | No | The paper is theoretical and does not involve experiments with dataset splits. Therefore, no training/validation/test dataset split information is provided. |
| Hardware Specification | No | The paper is theoretical and does not involve computational experiments, so no hardware specifications are mentioned. |
| Software Dependencies | No | The paper is theoretical and focuses on mathematical analysis and design principles. It does not mention any specific software dependencies with version numbers. |
| Experiment Setup | No | The paper is purely theoretical, presenting mathematical analysis and design. It does not describe any experimental setup details, hyperparameters, or training configurations. |