Subhomogeneous Deep Equilibrium Models
Authors: Pietro Sittoni, Francesco Tudisco
ICML 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our theoretical findings are complemented by several experimental evaluations where we compare simple fully-connected and convolutional DEQ architectures based on monotone operators with the newly introduced subhomogeneous deep equilibrium model (SubDEQ) on benchmark image classification tasks. |
| Researcher Affiliation | Academia | 1School of Mathematics, Gran Sasso Science Institute, L'Aquila, Italy 2School of Mathematics and Maxwell Institute, University of Edinburgh, Edinburgh, UK. |
| Pseudocode | No | The paper does not contain any clearly labeled pseudocode or algorithm blocks. |
| Open Source Code | Yes | All the models are implemented in PyTorch and the code is available at https://github.com/COMPiLELab/SubDEQ. |
| Open Datasets | Yes | We train them on different image benchmark datasets: CIFAR-10 (Krizhevsky & Hinton, 2009), SVHN (Netzer et al., 2011), and MNIST (LeCun & Cortes, 2010). |
| Dataset Splits | Yes | Regarding the data, we use the hold-out approach, dividing the dataset into train, validation, and test sets. Appendix C describes the proportions of the splittings. Table 6 (train / validation / test): MNIST 71% / 14.5% / 14.5%; CIFAR-10 70% / 15% / 15%; SVHN 50.358% / 23.4235% / 26.2184%. |
| Hardware Specification | No | No specific hardware details such as GPU models, CPU types, or cloud instance specifications used for running experiments were mentioned in the paper. |
| Software Dependencies | No | The paper states, 'All the models are implemented in PyTorch,' but does not provide specific version numbers for PyTorch or any other software dependencies. |
| Experiment Setup | Yes | Table 8 provides detailed 'Models Hyperparameter' including 'Number of input channels', 'Number of hidden channels', 'Size of hidden channels', 'Hidden kernel size', 'Input kernel size', 'Dimension of input weight matrix', 'Dimension of hidden weight matrix', 'Average pooling', 'Epochs', 'Initial learning rate', 'Learning rate schedule', 'minimum learning rate', 'Weight decay', and 'Batch size'. |
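The hold-out protocol reported in the Dataset Splits row can be sketched as a plain index split. This is an illustrative reconstruction, not the authors' code: the function name `holdout_split`, the seed, and the 1000-sample dataset size are assumptions; the 71% / 14.5% / 14.5% proportions are the MNIST figures from Table 6.

```python
import random

def holdout_split(n_samples, train_frac, val_frac, seed=0):
    """Shuffle sample indices and carve them into train/val/test
    index lists (a simple hold-out split; test takes the remainder)."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    n_train = int(train_frac * n_samples)
    n_val = int(val_frac * n_samples)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

# MNIST proportions from Table 6: 71% train, 14.5% validation, 14.5% test
train_idx, val_idx, test_idx = holdout_split(1000, 0.71, 0.145)
print(len(train_idx), len(val_idx), len(test_idx))  # → 710 145 145
```

Each index list would then be wrapped in a dataset/loader of the user's choice; computing the test size as the remainder guarantees the three parts cover every sample exactly once.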