PenDer: Incorporating Shape Constraints via Penalized Derivatives

Authors: Akhil Gupta, Lavanya Marla, Ruoyu Sun, Naman Shukla, Arinbjörn Kolbeinsson

AAAI 2021

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Experiments on three real-world datasets illustrate that even though both PenDer and state-of-the-art Lattice models achieve similar conformance to shape, PenDer captures better sensitivity of prediction with respect to intended features. We also demonstrate that PenDer achieves better test performance than Lattice while enforcing more desirable shape behavior. In this section, we present empirical results for the proposed PenDer approach...
Researcher Affiliation | Collaboration | (1) University of Illinois, Urbana-Champaign, IL, USA; (2) Deepair LLC, Dallas, TX, USA; (3) Imperial College London, London, UK
Pseudocode | Yes | Algorithm 1, PenDer: Penalizing Derivatives (a hedged sketch of the derivative-penalty idea appears after this table).
Open Source Code | Yes | Code and Appendix are available in the GitHub repository at https://github.com/deepair-io/PenDer.
Open Datasets | Yes | Law School Admissions (Rankin 2020): a classification task to predict whether or not an applicant would be accepted to a particular law school. Used Cars (Leka 2019): retrieved from eBay-Germany, this dataset contains data on used cars for resale. Sberbank Russian Housing (Sberbank 2017): used in Appendix E.
Dataset Splits | Yes | We use 20% of the entire dataset for test analysis, and perform a random 80-20 split on the remaining data to create the training and validation sets (see the split sketch after this table).
Hardware Specification | Yes | We train our models on a 2.7 GHz Dual-Core processor.
Software Dependencies | No | The paper mentions an implementation in TensorFlow, but does not provide specific version numbers for TensorFlow or any other software dependencies.
Experiment Setup | Yes | We train a 4-layer neural network (two hidden layers) for all our experiments (except Lattice). We tune hyperparameters such as the optimizer, learning rate, and activation function on the DNN, and keep them the same for PenDer to facilitate a fair comparison. Adam (Kingma and Ba 2015) outperforms SGD on all the datasets. We employ early stopping (no model improvement for 40 epochs) as our stopping criterion, and decay the learning rate by a factor of 10 when the validation error reaches a plateau (a training-loop sketch follows this table).
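
The paper's Algorithm 1 is not reproduced in this excerpt; the following is a minimal sketch of the general idea the title names, adding a penalty on the model's derivatives to the training loss so that a shape constraint is (approximately) enforced. It assumes a monotonicity constraint on a single feature; `penalized_loss`, `mono_idx`, and `lam` are hypothetical names, not the authors' API.

```python
import tensorflow as tf

def penalized_loss(model, x, y, mono_idx, lam=1.0):
    """MSE plus a penalty on negative partial derivatives w.r.t. one feature.

    Hypothetical sketch: encourages the prediction to be monotonically
    increasing in feature `mono_idx` by penalizing d(pred)/d(x) < 0.
    """
    with tf.GradientTape() as tape:
        tape.watch(x)
        preds = model(x, training=True)      # shape (batch, 1)
    grads = tape.gradient(preds, x)          # shape (batch, n_features)
    # Hinge on the constrained partial derivative: zero cost when >= 0.
    penalty = tf.reduce_mean(tf.nn.relu(-grads[:, mono_idx]))
    mse = tf.reduce_mean(tf.square(tf.squeeze(preds, -1) - y))
    return mse + lam * penalty
```

In a sketch like this, `lam` trades off fit against shape conformance; a decreasing constraint would penalize positive derivatives instead.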
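
The split protocol quoted above is standard and maps directly onto scikit-learn. In the sketch below, the placeholder arrays `X`, `y` and the random seed are assumptions, not from the paper.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder data standing in for any of the three datasets.
X, y = np.random.rand(1000, 10), np.random.rand(1000)

# Hold out 20% of the full data as the test set.
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Random 80-20 split of the remainder into training and validation sets.
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.2, random_state=0
)
```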
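
The experiment-setup entry likewise corresponds to standard Keras callbacks: Adam, early stopping after 40 epochs without improvement, and a 10x learning-rate decay on a validation plateau. The hidden widths, learning rate, and MSE loss below are assumptions not specified in the excerpt; the data reuses the split from the previous sketch.

```python
import tensorflow as tf

# 4-layer network (two hidden layers, as stated); widths are assumed.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])

callbacks = [
    # Stop when validation loss shows no improvement for 40 epochs.
    tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=40),
    # Decay the learning rate by a factor of 10 on a validation plateau.
    tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.1),
]

model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3), loss="mse")
model.fit(X_train, y_train, validation_data=(X_val, y_val),
          epochs=1000, callbacks=callbacks)
```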