Expressive Monotonic Neural Networks

Authors: Niklas Nolte, Ouail Kitouni, Mike Williams

ICLR 2023

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The paper is experimental: "We show how the algorithm is used to train powerful, robust, and interpretable discriminators that achieve competitive performance compared to current state-of-the-art methods across various benchmarks, from social applications to the classification of the decays of subatomic particles produced at the CERN Large Hadron Collider." The authors further state: "In this section, we test our algorithm on many different domains to show that it works well in practice and gives competitive results, as should be expected from a universal approximator," and "As can be seen in Table 1, our Lipschitz monotonic networks perform competitively or better than the state-of-the-art on all benchmarks we tried."
Researcher Affiliation | Academia | Niklas Nolte, Ouail Kitouni, Mike Williams; The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; {nnolte, kitouni, mwill}@mit.edu
Pseudocode | No | The paper describes the mathematical formulations and architectural details of the proposed method but does not include any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "All experiments with public datasets are reproducible with the code provided at https://github.com/niklasnolte/monotonic_tests."
Open Datasets | Yes | COMPAS (Larson & Kirchner, 2016), Blog Feedback (Buza, 2014), Loan Defaulter (Kaggle, 2015), and Chest XRay (Wang et al., 2017). From Sivaraman et al. (2020) the authors compare against one regression and one classification task: Auto MPG (Dua & Graff, 2017) and Heart Disease (Gennari et al., 1989).
Dataset Splits | No | The paper reports "Test Acc" in Table 1 and discusses training accuracy, but it does not explicitly describe a validation split for any of the datasets used in the experiments. Although these standard datasets often ship with predefined splits, the paper does not specify which splits it used.
Hardware Specification | No | The paper does not specify the hardware (e.g., CPU or GPU models, memory) used to run the experiments. The CERN Large Hadron Collider is mentioned only as a data source and application domain, not as computational infrastructure for the models.
Software Dependencies | No | Appendix A states: "We use Adam with default hyper-parameters im [sic] all experiments." The optimizer is named, but no version numbers are given for it or for other key software components or libraries (e.g., Python, TensorFlow, PyTorch).
Experiment Setup | Yes | Table 2: training MNIST and CIFAR10/100 to 100% training accuracy with Lipschitz networks. Task: MNIST, CIFAR10, CIFAR100. Width: 1024. Depth: 3. LR: 10^-5. Epochs: 10^5. Batch size: all (full batch). Loss: CE(τ = 256). "We use Adam with default hyper-parameters im [sic] all experiments."
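The CE(τ = 256) loss in the setup row above is a temperature-scaled cross-entropy: because Lipschitz-constrained networks produce bounded logits, multiplying them by a large τ before the softmax restores a usable dynamic range. A minimal pure-Python sketch, assuming the straightforward form logits → τ·logits → softmax → negative log-likelihood (the paper's exact parameterization is not reproduced here, and the function name is illustrative):

```python
import math

def ce_loss_with_temperature(logits, label, tau=256.0):
    """Temperature-scaled cross-entropy: softmax over tau * logits.

    Sketch under the assumption that CE(tau) simply rescales the
    bounded outputs of a Lipschitz network before the softmax.
    """
    scaled = [tau * z for z in logits]
    # Numerically stable log-sum-exp over the scaled logits.
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    # Negative log-probability of the true class.
    return log_sum_exp - scaled[label]

# With tau = 1 this reduces to ordinary cross-entropy; with tau = 256
# even a small logit margin drives the loss toward zero.
loss_small_tau = ce_loss_with_temperature([1.0, 0.0], 0, tau=1.0)
loss_large_tau = ce_loss_with_temperature([1.0, 0.0], 0, tau=256.0)
```

With τ = 1 and logits [1.0, 0.0] for the correct class 0, the loss is log(1 + e⁻¹) ≈ 0.313; at τ = 256 the same margin yields a loss that is numerically zero, which is consistent with the goal of driving training accuracy to 100%.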