Scalable Monotonic Neural Networks

Authors: Hyunho Kim, Jong-Seok Lee

ICLR 2024

Reproducibility assessment — each item below gives the assessed variable, the result, and the supporting excerpt (LLM response) from the paper.
Research Type: Experimental
"Numerical experiments demonstrated that our method achieved comparable prediction accuracy to the state-of-the-art approaches while effectively addressing the aforementioned weaknesses." (Section 4, Numerical Experiments)
Researcher Affiliation: Academia
"Hyunho Kim, Jong-Seok Lee, Department of Industrial Engineering, Sungkyunkwan University, Suwon, Republic of Korea, {retna319,jongseok}@skku.edu"
Pseudocode: No
The paper describes the network structure and mathematical formulations but does not provide a dedicated pseudocode or algorithm block.
Open Source Code: Yes
"All implemented code can be found in the supplements, which is also available at https://github.com/retna319/SMNN."
Open Datasets: Yes
"The Auto-MPG and Blog Feedback (Spiliopoulou et al., 2014) datasets were used for regression tasks, while the COMPAS (Angwin et al., 2016), Heart Disease, and Loan Defaulter datasets were employed for classification."
Dataset sources: Auto-MPG (https://archive.ics.uci.edu/ml/datasets/auto+mpg), Heart Disease (https://archive.ics.uci.edu/ml/datasets/Heart+Disease), Loan Defaulter (https://www.kaggle.com/wendykan/lending-club-loan-data)
Dataset Splits: Yes
"The dataset was divided into a 70% training set and a 30% test set. For each folding, we conducted five independent runs, which resulted in a total of 25 experimental trials." ... "we employed a five-fold cross-validation strategy, using 80% of the data for training and 20% for testing."
Hardware Specification: Yes
"All experiments were conducted on a system equipped with an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz processor and 32.0GB of DDR3 RAM, running the Windows 10 operating system."
Software Dependencies: Yes
"The experiments were implemented in Python, utilizing the PyTorch library (version 1.10.2)."
Experiment Setup: Yes
"Stochastic optimization for training networks was performed using the Adam optimizer. The best hyperparameters were determined through grid search, considering different batch sizes (128, 256, 512) and learning rates (0.05, 0.005, 0.002, 0.0005). The neural network architecture consisted of two scalable monotonic hidden layers." ... "For regression tasks, the mean squared error loss was employed, while for classification tasks, the cross-entropy loss was used. The number of epochs was set to either 1000 or 500, depending on the dataset."
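
For reference, the snippet below shows one way to load the UCI Auto-MPG data listed under Open Datasets. It is an illustrative sketch, not code from the SMNN repository; it assumes the file auto-mpg.data has been downloaded from the UCI page above, and the column names follow the UCI attribute description.

```python
# Illustrative sketch (not from the SMNN repository): loading the UCI
# Auto-MPG regression data, assuming `auto-mpg.data` was downloaded from
# the UCI page linked above. '?' marks missing horsepower values; the
# trailing car-name field is tab-separated and dropped via comment="\t".
import pandas as pd

columns = ["mpg", "cylinders", "displacement", "horsepower",
           "weight", "acceleration", "model_year", "origin"]
df = pd.read_csv("auto-mpg.data", names=columns, na_values="?",
                 comment="\t", sep=" ", skipinitialspace=True).dropna()
X = df.drop(columns="mpg").to_numpy(dtype="float32")  # 7 numeric features
y = df["mpg"].to_numpy(dtype="float32")               # regression target
```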
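
The Dataset Splits entry quotes two protocols: a 70%/30% hold-out split, and five-fold cross-validation with five independent runs per fold (25 trials in total). A minimal sketch of the cross-validation variant is given below; build_model and evaluate are hypothetical placeholders, and the authors' actual experiment code is in the linked repository.

```python
# Minimal sketch of the quoted evaluation protocol: five-fold
# cross-validation with five independent runs per fold (5 x 5 = 25 trials).
# `build_model` and `evaluate` are hypothetical placeholders.
import numpy as np
from sklearn.model_selection import KFold

def run_protocol(X, y, build_model, evaluate, n_folds=5, n_runs=5, seed=0):
    scores = []
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
        for run in range(n_runs):  # five independent runs per fold
            model = build_model(seed=1000 * fold + run)
            model.fit(X[train_idx], y[train_idx])
            scores.append(evaluate(model, X[test_idx], y[test_idx]))
    return float(np.mean(scores)), float(np.std(scores))  # over all 25 trials
```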
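
Likewise, the Experiment Setup entry specifies the optimizer, the hyperparameter grid, and the loss functions. The sketch below outlines how such a grid search could be run in PyTorch; it does not reimplement SMNN, so make_model stands in for the authors' two-scalable-monotonic-layer network, and selecting by validation loss is an assumption about how the best hyperparameters were chosen.

```python
# Hedged sketch of the quoted setup: Adam optimizer, grid search over batch
# sizes (128, 256, 512) and learning rates (0.05, 0.005, 0.002, 0.0005),
# MSE loss for regression / cross-entropy for classification. `make_model`
# is a placeholder for the authors' two-layer scalable monotonic network,
# and selection by validation loss is an assumption, not a quoted detail.
import itertools
import torch
from torch.utils.data import DataLoader

def grid_search(train_ds, val_x, val_y, make_model, epochs=500, regression=True):
    criterion = torch.nn.MSELoss() if regression else torch.nn.CrossEntropyLoss()
    best_loss, best_config = float("inf"), None
    for batch_size, lr in itertools.product((128, 256, 512),
                                            (0.05, 0.005, 0.002, 0.0005)):
        model = make_model()
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        loader = DataLoader(train_ds, batch_size=batch_size, shuffle=True)
        for _ in range(epochs):            # 1000 or 500 epochs in the paper
            for xb, yb in loader:
                optimizer.zero_grad()
                loss = criterion(model(xb), yb)
                loss.backward()
                optimizer.step()
        with torch.no_grad():              # keep the best (batch size, lr) pair
            val_loss = criterion(model(val_x), val_y).item()
        if val_loss < best_loss:
            best_loss, best_config = val_loss, (batch_size, lr)
    return best_config, best_loss
```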