Monotone operator equilibrium networks
Authors: Ezra Winston, J. Zico Kolter
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To test the expressive power and training stability of monDEQs, we evaluate the monDEQ instantiations described in Section 4 on several image classification benchmarks. We take as a point of comparison the Neural ODE (NODE) [8] and Augmented Neural ODE (ANODE) [10] models, the only other implicit-depth models which guarantee the existence and uniqueness of a solution. We also assess the stability of training standard DEQs of the same form as our monDEQs. |
| Researcher Affiliation | Collaboration | Ezra Winston, School of Computer Science, Carnegie Mellon University, Pittsburgh, United States (ewinston@cs.cmu.edu); J. Zico Kolter, School of Computer Science, Carnegie Mellon University & Bosch Center for AI, Pittsburgh, United States (zkolter@cs.cmu.edu) |
| Pseudocode | Yes | Algorithm 1 Forward-backward equilibrium solving; Algorithm 2 Peaceman-Rachford equilibrium solving; Algorithm 3 Forward-backward equilibrium backpropagation; Algorithm 4 Peaceman-Rachford equilibrium backpropagation (minimal sketches of Algorithms 1 and 2 appear after the table) |
| Open Source Code | Yes | Code is available at http://github.com/locuslab/monotone_op_net. |
| Open Datasets | Yes | We train small monDEQs on CIFAR-10 [17], SVHN [22], and MNIST [18], with a similar number of parameters as the ODE-based models reported in [8] and [10]. |
| Dataset Splits | No | The paper mentions training and testing, but does not provide explicit details about validation splits, percentages, or sample counts needed to reproduce the data partitioning. |
| Hardware Specification | Yes | All experiments are run on a single RTX 2080 Ti GPU. |
| Software Dependencies | No | The paper does not provide specific version numbers for any software dependencies (e.g., programming languages, libraries, or frameworks like PyTorch, TensorFlow). |
| Experiment Setup | Yes | For further training details and model architectures see Appendix E. |
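
The forward-backward iteration (Algorithm 1 above) is simple enough to sketch directly. Below is a minimal NumPy sketch for a fully-connected monDEQ, assuming the paper's Section 4 parameterization W = (1 − m)I − AᵀA + B − Bᵀ; the function names, stopping rule, and step-size choice here are our own illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def monotone_W(A, B, m=0.05):
    # The paper's Section 4 parameterization W = (1 - m) I - A^T A + B - B^T,
    # which enforces I - W >= m I (strong monotonicity) and hence existence
    # and uniqueness of the equilibrium.
    n = A.shape[1]
    return (1.0 - m) * np.eye(n) - A.T @ A + B - B.T

def forward_backward_solve(W, Ux_plus_b, alpha, tol=1e-6, max_iter=2000):
    # Algorithm 1 (forward-backward splitting):
    #     z <- relu((1 - alpha) z + alpha (W z + U x + b)),
    # which converges to the equilibrium z* = relu(W z* + U x + b)
    # for a sufficiently small step size alpha.
    z = np.zeros_like(Ux_plus_b)
    for _ in range(max_iter):
        z_next = np.maximum(0.0, (1.0 - alpha) * z + alpha * (W @ z + Ux_plus_b))
        if np.linalg.norm(z_next - z) <= tol * (1.0 + np.linalg.norm(z)):
            break
        z = z_next
    return z_next

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m = 64, 0.05
    A = rng.normal(size=(n, n)) / np.sqrt(n)
    B = rng.normal(size=(n, n)) / np.sqrt(n)
    W = monotone_W(A, B, m)
    # Conservative step size, well inside the standard forward-backward
    # condition alpha < 2 m / L^2, with L the spectral norm of I - W.
    L = np.linalg.norm(np.eye(n) - W, 2)
    z_star = forward_backward_solve(W, rng.normal(size=n), alpha=m / L**2)
```

With ReLU playing the role of the proximal operator of the nonnegativity constraint, the whole solve reduces to the damped iteration in the comment; the paper's convolutional instantiations replace the dense matrix products accordingly.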
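Algorithm 2 swaps the explicit step for a resolvent and, per the paper, typically converges in far fewer iterations. Here is a sketch under the same assumptions as above, with the resolvent (I + α(I − W))⁻¹ formed densely for illustration (the paper inverts it efficiently, e.g. via FFTs for convolutional layers); again, names and defaults are ours.

```python
def peaceman_rachford_solve(W, Ux_plus_b, alpha=1.0, tol=1e-6, max_iter=500):
    # Algorithm 2 (Peaceman-Rachford splitting). Unlike forward-backward,
    # it converges for any alpha > 0 given strong monotonicity, at the cost
    # of a linear solve per iteration; the resolvent is factored once here.
    n = W.shape[0]
    V = np.linalg.inv(np.eye(n) + alpha * (np.eye(n) - W))
    z = np.zeros_like(Ux_plus_b)
    u = np.zeros_like(Ux_plus_b)
    for _ in range(max_iter):
        u_half = 2.0 * z - u                       # reflection step
        z_half = V @ (u_half + alpha * Ux_plus_b)  # resolvent of I - W
        u = 2.0 * z_half - u_half                  # second reflection
        z_next = np.maximum(0.0, u)                # prox of the ReLU term
        if np.linalg.norm(z_next - z) <= tol * (1.0 + np.linalg.norm(z)):
            return z_next
        z = z_next
    return z
```

Called as `peaceman_rachford_solve(W, rng.normal(size=n))`, it is a drop-in replacement for the forward-backward solve above, without the step-size restriction.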