Smooth Min-Max Monotonic Networks

Authors: Christian Igel

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments show that this does not come with a loss in generalization performance compared to alternative neural and non-neural approaches."
Researcher Affiliation | Academia | "Department of Computer Science, University of Copenhagen, Copenhagen, Denmark. Correspondence to: Christian Igel <igel@diku.dk>."
Pseudocode | No | The paper does not contain structured pseudocode or algorithm blocks.
Open Source Code | Yes | "All experiments, plots, tables, and statistics can be reproduced using the source code available from https://github.com/christian-igel/SMM."
Open Datasets | Yes | "We considered modelling partial monotone functions on real-world data sets from the UCI benchmark repository (Dua & Graff, 2017)." "Table 2. UCI regression data sets and constraints as considered by Yanagisawa et al. (2022). Energy (Tsanas & Xifara, 2012), QSAR (Cassotti et al., 2015), and Concrete (Yeh, 1998)."
Dataset Splits | Yes | "From each fold available for training, 25 % were used as a validation data set for early-stopping and final model selection, giving a 60:20:20 split in accordance with Yanagisawa et al. (2022)." (See the split sketch below the table.)
Hardware Specification | Yes | "Table C.9. Multivariate tasks, degrees of freedom of the neural networks and accumulated training times (on an Apple M1 Pro) in seconds for conducting 21 trials with 1000 training steps each."
Software Dependencies | No | The paper mentions software such as the scikit-learn library, XGBoost, and the Rprop optimization algorithm, and refers to implementations by other authors, but it does not provide version numbers for any of these dependencies.
Experiment Setup | Yes | "We set the number of estimators in XGBoost to ntrees = 73 and ntrees = 35 (as the behavior was similar, we report only the results for ntrees = 73 in the following); for all other hyperparameters the default values were used. The weight parameters z_i^(k,j) and the bias parameters were randomly initialized by samples from a Gaussian distribution with zero mean and unit variance truncated to [−2, 2]. We also used exponential encoding of β and initialized ln β with 1." (See the setup sketch below the table.)
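
The 60:20:20 split described in the Dataset Splits row can be sketched as follows. This is a minimal illustration, not the author's released code: the function name cv_splits and the use of scikit-learn's KFold and train_test_split are assumptions made for the example.

    from sklearn.model_selection import KFold, train_test_split

    def cv_splits(X, y, n_splits=5, val_fraction=0.25, seed=0):
        # 5-fold cross-validation; each test fold is 20 % of the data.
        kf = KFold(n_splits=n_splits, shuffle=True, random_state=seed)
        for train_idx, test_idx in kf.split(X):
            # Hold out 25 % of the 80 % training portion for early stopping and
            # model selection, giving roughly a 60:20:20 train/validation/test split.
            tr_idx, val_idx = train_test_split(train_idx, test_size=val_fraction,
                                               random_state=seed)
            yield (X[tr_idx], y[tr_idx]), (X[val_idx], y[val_idx]), (X[test_idx], y[test_idx])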
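The Experiment Setup row can likewise be illustrated in code. The sketch below is not the author's implementation: the monotone_constraints tuple and the tensor shapes are assumed for illustration, and only the values quoted above (73 estimators, truncated standard-normal initialization on [−2, 2], exponential encoding of β) come from the paper.

    import torch
    from xgboost import XGBRegressor

    # XGBoost baseline: 73 trees, all other hyperparameters at their defaults.
    # The per-feature constraint tuple (1 = increasing, 0 = unconstrained) is an
    # assumed example; the actual constraints are task-specific (see Table 2).
    xgb_model = XGBRegressor(n_estimators=73, monotone_constraints=(1, 0, 0))

    # Weight and bias parameters: samples from a zero-mean, unit-variance Gaussian
    # truncated to [-2, 2]; the tensor shapes here are illustrative.
    weights = torch.nn.init.trunc_normal_(torch.empty(4, 3), mean=0.0, std=1.0, a=-2.0, b=2.0)
    biases = torch.nn.init.trunc_normal_(torch.empty(4), mean=0.0, std=1.0, a=-2.0, b=2.0)

    # Exponential encoding of the smoothness parameter: the trainable quantity is
    # ln(beta), so beta = exp(ln_beta) remains positive during optimization.
    ln_beta = torch.nn.Parameter(torch.tensor(1.0))  # initialized with the value quoted above
    beta = torch.exp(ln_beta)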