On the Expected Complexity of Maxout Networks
Authors: Hanna Tseran, Guido F. Montúfar
NeurIPS 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We investigate different parameter initialization procedures and show that they can increase the speed of convergence in training. |
| Researcher Affiliation | Academia | Hanna Tseran, Max Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany, hanna.tseran@mis.mpg.de; Guido Montúfar, Department of Mathematics and Department of Statistics, UCLA, Los Angeles, CA 90095, USA; Max Planck Institute for Mathematics in the Sciences, 04103 Leipzig, Germany, montufar@math.ucla.edu |
| Pseudocode | Yes | Algorithm for counting activation regions: Several approaches for counting linear regions of ReLU networks have been considered (e.g., Serra et al., 2018; Hanin and Rolnick, 2019b; Serra and Ramalingam, 2020; Xiong et al., 2020). For maxout networks, we count the activation regions and pieces of the decision boundary by iteratively adding linear inequality constraints and verifying feasibility using linear programming. Pseudocode and complexity analysis are provided in Appendix I. (A minimal LP feasibility sketch is given below the table.) |
| Open Source Code | Yes | The computer implementation of the key functions is available on GitHub at https://github.com/hanna-tseran/maxout_complexity. |
| Open Datasets | Yes | We consider the 10-class classification task with the MNIST dataset (LeCun et al., 2010). |
| Dataset Splits | Yes | We use the standard train/validation/test split of 50000/10000/10000, respectively. |
| Hardware Specification | Yes | All experiments were run on an NVIDIA RTX A6000 GPU or on the Max Planck Computing and Data Facility (MPCDF) cluster. |
| Software Dependencies | No | The paper mentions general software like PyTorch, NumPy, SciPy, Matplotlib, and Python but does not provide specific version numbers for any of these dependencies required for reproduction. |
| Experiment Setup | Yes | We train our networks on the MNIST dataset, with batch size 100, for 100 epochs, using the Adam optimizer with learning rate 0.001. (A hedged training-setup sketch follows the table.) |
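
The pseudocode row above describes counting activation regions by adding linear inequality constraints and checking feasibility with linear programming. As a minimal sketch (not the authors' Appendix I algorithm), the snippet below checks whether a candidate activation pattern of a single maxout layer defines a non-empty region using `scipy.optimize.linprog`; the function name `pattern_is_feasible` and the bounding box on the input are illustrative assumptions.

```python
# Hedged sketch: LP feasibility check for one candidate activation pattern of a
# single maxout layer. Illustrative only; the paper's actual pseudocode is in
# its Appendix I and its implementation is in the linked GitHub repository.
import numpy as np
from scipy.optimize import linprog

def pattern_is_feasible(W, b, pattern, box=10.0):
    """W: (units, rank, in_dim) weights, b: (units, rank) biases.
    pattern[k] = index of the pre-activation assumed to attain the max in unit k.
    Returns True if some x in [-box, box]^in_dim satisfies
    W[k, pattern[k]] x + b[k, pattern[k]] >= W[k, j] x + b[k, j] for all j, k."""
    units, rank, in_dim = W.shape
    A_ub, b_ub = [], []
    for k in range(units):
        w_max, c_max = W[k, pattern[k]], b[k, pattern[k]]
        for j in range(rank):
            if j == pattern[k]:
                continue
            # Rewrite the inequality as (W[k, j] - w_max) x <= c_max - b[k, j].
            A_ub.append(W[k, j] - w_max)
            b_ub.append(c_max - b[k, j])
    res = linprog(c=np.zeros(in_dim), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
                  bounds=[(-box, box)] * in_dim, method="highs")
    return res.status == 0  # status 0 means a feasible point was found

# Tiny usage example with random weights (illustrative only).
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 2, 2))   # 3 maxout units of rank 2 on 2D inputs
b = rng.normal(size=(3, 2))
print(pattern_is_feasible(W, b, pattern=[0, 1, 0]))
```

Iterating this check over candidate patterns, adding one unit's constraints at a time and pruning infeasible branches early, is the general shape of the counting procedure the row describes.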
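The reported setup (MNIST, 50000/10000/10000 split, batch size 100, 100 epochs, Adam with learning rate 0.001) can be expressed as a short PyTorch sketch. The network widths, maxout rank, and split mechanics below are assumptions for illustration; the authors' actual implementation is at the GitHub link above.

```python
# Hedged sketch of the reported training setup. Architecture details (widths,
# rank 2) and the random_split call are illustrative, not the authors' code.
import torch
from torch import nn
from torch.utils.data import DataLoader, random_split
from torchvision import datasets, transforms

class Maxout(nn.Module):
    """Maxout layer: maximum over `rank` affine maps per output unit."""
    def __init__(self, in_dim, out_dim, rank=2):
        super().__init__()
        self.rank, self.out_dim = rank, out_dim
        self.linear = nn.Linear(in_dim, out_dim * rank)

    def forward(self, x):
        z = self.linear(x).view(x.shape[0], self.out_dim, self.rank)
        return z.max(dim=-1).values

train_full = datasets.MNIST("data", train=True, download=True,
                            transform=transforms.ToTensor())
test_set = datasets.MNIST("data", train=False, transform=transforms.ToTensor())
train_set, val_set = random_split(train_full, [50000, 10000])  # 50000/10000 split

model = nn.Sequential(nn.Flatten(), Maxout(784, 100), Maxout(100, 100),
                      nn.Linear(100, 10))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # reported learning rate
loss_fn = nn.CrossEntropyLoss()
loader = DataLoader(train_set, batch_size=100, shuffle=True)  # reported batch size

for epoch in range(100):  # reported number of epochs
    for images, labels in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(images), labels)
        loss.backward()
        optimizer.step()
```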