Subhomogeneous Deep Equilibrium Models

Authors: Pietro Sittoni, Francesco Tudisco

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Our theoretical findings are complemented by several experimental evaluations where we compare simple fully-connected and convolutional DEQ architectures based on monotone operators with the newly introduced subhomogeneous deep equilibrium model (SubDEQ) on benchmark image classification tasks.
Researcher Affiliation | Academia | 1. School of Mathematics, Gran Sasso Science Institute, L'Aquila, Italy. 2. School of Mathematics and Maxwell Institute, University of Edinburgh, Edinburgh, UK.
Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks.
Open Source Code | Yes | All the models are implemented in PyTorch and the code is available at https://github.com/COMPiLELab/SubDEQ.
Open Datasets | Yes | We train them on different image benchmark datasets: CIFAR-10 (Krizhevsky & Hinton, 2009), SVHN (Netzer et al., 2011), and MNIST (LeCun & Cortes, 2010).
Dataset Splits | Yes | Regarding the data, we use the hold-out approach, dividing the dataset into train, validation, and test sets. Appendix C describes the proportions of the splits (Table 6):

Dataset   | Train set size | Validation set size | Test set size
MNIST     | 71 %           | 14.5 %              | 14.5 %
CIFAR-10  | 70 %           | 15 %                | 15 %
SVHN      | 50.358 %       | 23.4235 %           | 26.2184 %
Hardware Specification | No | No specific hardware details, such as GPU models, CPU types, or cloud instance specifications used for running experiments, were mentioned in the paper.
Software Dependencies | No | The paper states, 'All the models are implemented in PyTorch,' but does not provide specific version numbers for PyTorch or any other software dependencies.
Experiment Setup | Yes | Table 8 provides detailed model hyperparameters, including: number of input channels, number of hidden channels, size of hidden channels, hidden kernel size, input kernel size, dimension of input weight matrix, dimension of hidden weight matrix, average pooling, epochs, initial learning rate, learning rate schedule, minimum learning rate, weight decay, and batch size.
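The hold-out protocol described under "Dataset Splits" can be sketched in a few lines of plain Python. The helper below is illustrative, not the authors' code: it shuffles indices with a fixed seed and cuts them according to the MNIST proportions from Table 6 (71 % / 14.5 % / 14.5 %); in a PyTorch pipeline the resulting index lists would typically be wrapped in `torch.utils.data.Subset`.

```python
import random

def holdout_split(indices, train_frac, val_frac, seed=0):
    """Shuffle indices reproducibly and cut them into train/val/test.

    The test set receives whatever remains after the train and
    validation fractions are taken, so the three parts always
    cover the full dataset.
    """
    idx = list(indices)
    random.Random(seed).shuffle(idx)
    n = len(idx)
    n_train = round(train_frac * n)
    n_val = round(val_frac * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]

# MNIST proportions from Table 6 on a hypothetical 1000-sample dataset.
train, val, test = holdout_split(range(1000), 0.71, 0.145)
print(len(train), len(val), len(test))  # 710 145 145
```

Using a seeded `random.Random` instance (rather than the module-level generator) keeps the split reproducible without perturbing global random state.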
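The hyperparameters listed under "Experiment Setup" can be gathered into a single config object. The sketch below uses placeholder values (the actual numbers are in Table 8 of the paper), and the cosine decay shown is only one plausible realisation of a "learning rate schedule" with a floor at the minimum learning rate; the paper does not specify which schedule was used.

```python
import math

# Placeholder hyperparameters; the real values live in Table 8 of the paper.
config = {
    "num_input_channels": 1,
    "num_hidden_channels": 64,
    "input_kernel_size": 3,
    "hidden_kernel_size": 3,
    "epochs": 30,
    "initial_lr": 1e-3,
    "min_lr": 1e-5,
    "weight_decay": 1e-4,
    "batch_size": 128,
}

def lr_at_epoch(epoch, cfg):
    """Cosine decay from initial_lr down to min_lr over the training run."""
    t = epoch / max(cfg["epochs"] - 1, 1)  # progress in [0, 1]
    return cfg["min_lr"] + 0.5 * (cfg["initial_lr"] - cfg["min_lr"]) * (1 + math.cos(math.pi * t))

print(round(lr_at_epoch(0, config), 6))                    # starts at initial_lr
print(round(lr_at_epoch(config["epochs"] - 1, config), 6)) # ends at min_lr
```

Keeping the schedule as a pure function of the epoch and the config makes it easy to log, plot, or swap for an alternative (e.g. step decay) without touching the training loop.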