Scalable Monotonic Neural Networks
Authors: Hyunho Kim, Jong-Seok Lee
ICLR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Numerical experiments demonstrated that our method achieved comparable prediction accuracy to the state-of-the-art approaches while effectively addressing the aforementioned weaknesses. (Section 4, Numerical Experiments) |
| Researcher Affiliation | Academia | Hyunho Kim, Jong-Seok Lee Department of Industrial Engineering Sungkyunkwan University Suwon, Republic of Korea {retna319,jongseok}@skku.edu |
| Pseudocode | No | The paper describes the network structure and mathematical formulations but does not provide a dedicated pseudocode or algorithm block. |
| Open Source Code | Yes | All implemented code can be found in the supplements, which is also available at https://github.com/retna319/SMNN. |
| Open Datasets | Yes | The Auto-MPG (https://archive.ics.uci.edu/ml/datasets/auto+mpg) and Blog Feedback (Spiliopoulou et al., 2014) datasets were used for regression tasks, while the COMPAS (Angwin et al., 2016), Heart Disease (https://archive.ics.uci.edu/ml/datasets/Heart+Disease), and Loan Defaulter (https://www.kaggle.com/wendykan/lending-club-loan-data) datasets were employed for classification. |
| Dataset Splits | Yes | The dataset was divided into a 70% training set and a 30% test set. For each fold, we conducted five independent runs, which resulted in a total of 25 experimental trials. ... we employed a five-fold cross-validation strategy, using 80% of the data for training and 20% for testing. |
| Hardware Specification | Yes | All experiments were conducted on a system equipped with an Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz processor and 32.0GB of DDR3 RAM, running the Windows 10 operating system. |
| Software Dependencies | Yes | The experiments were implemented in Python, utilizing the PyTorch library (version 1.10.2). |
| Experiment Setup | Yes | Stochastic optimization for training networks was performed using the Adam optimizer. The best hyperparameters were determined through grid search over batch sizes (128, 256, 512) and learning rates (0.05, 0.005, 0.002, 0.0005). The neural network architecture consisted of two scalable monotonic hidden layers. ... For regression tasks, the mean squared error loss was employed, while for classification tasks, the cross-entropy loss was used. The number of epochs was set to either 1000 or 500 depending on the dataset. |
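The evaluation protocol quoted in the Dataset Splits row can be sketched as follows. This is a minimal illustration only: the synthetic data and the `run_trial` placeholder are assumptions, not code from the authors' repository, and which datasets use the single 70/30 split versus the five-fold 80/20 cross-validation should be checked against https://github.com/retna319/SMNN.

```python
# Minimal sketch of the quoted split protocol (synthetic data and run_trial are placeholders).
import numpy as np
from sklearn.model_selection import KFold, train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))          # synthetic stand-in for a tabular dataset
y = X.sum(axis=1) + 0.1 * rng.normal(size=500)

def run_trial(X_tr, y_tr, X_te, y_te):
    # Placeholder for training and evaluating the monotonic network on one split.
    print("train:", X_tr.shape, "test:", X_te.shape)

# Protocol A (as quoted): a single 70% train / 30% test split.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.30, random_state=0)
run_trial(X_tr, y_tr, X_te, y_te)

# Protocol B (as quoted): five-fold cross-validation (80% train / 20% test per fold),
# with five independent runs per fold, i.e. 5 x 5 = 25 trials in total.
kf = KFold(n_splits=5, shuffle=True, random_state=0)
for fold, (tr_idx, te_idx) in enumerate(kf.split(X)):
    for run in range(5):
        run_trial(X[tr_idx], y[tr_idx], X[te_idx], y[te_idx])
```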
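The Experiment Setup row describes an Adam-based grid search over batch size and learning rate; a hedged PyTorch sketch of such a loop appears below. The two-hidden-layer MLP is only a stand-in for the paper's scalable monotonic layers (defined in the authors' repository), and the synthetic data, hidden width, and validation criterion are assumptions, not the authors' configuration.

```python
# Hedged sketch of the quoted training setup; the model is a stand-in, not the SMNN architecture.
import itertools
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
X = torch.randn(1024, 8)                         # synthetic regression data
y = X.sum(dim=1, keepdim=True) + 0.1 * torch.randn(1024, 1)
train_ds = TensorDataset(X[:768], y[:768])
X_val, y_val = X[768:], y[768:]

batch_sizes = [128, 256, 512]                    # grid from the paper
learning_rates = [0.05, 0.005, 0.002, 0.0005]    # grid from the paper
epochs = 500                                     # paper: 500 or 1000 depending on dataset

best_loss, best_cfg = float("inf"), None
for bs, lr in itertools.product(batch_sizes, learning_rates):
    # Stand-in for the two scalable monotonic hidden layers described in the paper.
    model = nn.Sequential(nn.Linear(8, 64), nn.ReLU(),
                          nn.Linear(64, 64), nn.ReLU(),
                          nn.Linear(64, 1))
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)  # Adam, as in the paper
    loss_fn = nn.MSELoss()          # cross-entropy would replace this for classification tasks
    loader = DataLoader(train_ds, batch_size=bs, shuffle=True)
    for _ in range(epochs):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss_fn(model(xb), yb).backward()
            optimizer.step()
    with torch.no_grad():
        val_loss = loss_fn(model(X_val), y_val).item()
    if val_loss < best_loss:
        best_loss, best_cfg = val_loss, (bs, lr)

print("best validation MSE:", best_loss, "with (batch_size, lr) =", best_cfg)
```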