Hierarchical Lattice Layer for Partially Monotone Neural Networks

Authors: Hiroki Yanagisawa, Kohei Miyaguchi, Takayuki Katsuki

NeurIPS 2022

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "Our experiments demonstrate that HLL did not sacrifice its prediction performance on real datasets compared with the lattice layer."
Researcher Affiliation | Industry | All three authors are affiliated with IBM Research Tokyo, IBM Japan, Ltd., Tokyo, Japan: Hiroki Yanagisawa (yanagis@jp.ibm.com), Kohei Miyaguchi (miyaguchi@ibm.com), and Takayuki Katsuki (kats@jp.ibm.com).
Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks.
Open Source Code | Yes | "We used Python 3.8.12 and PyTorch 1.8.1 to implement MLP and HLL, and the implementation of HLL is available at https://github.com/IBM/pmlayer." (An illustrative sketch of a partially monotone layer follows the table.)
Open Datasets | Yes | "We used 12 real datasets taken from the UCI Machine Learning Repository [Dua and Graff, 2017] in our experiments."
Dataset Splits | Yes | "In our experiments, we split the data points into training, validation, and test data points. For the unsplit datasets, we divided the data points into training (60%), validation (20%), and test (20%). For the datasets that were already split into training and test datasets, we further divided the data points in the training dataset into training (80%) and validation (20%) and kept the test dataset unchanged." (A split sketch follows the table.)
Hardware Specification | Yes | "All our experiments were conducted on a virtual machine with an Intel Xeon CPU (3.30 GHz) processor without any GPU and 64 GB of memory running Red Hat Enterprise Linux Server 7.6."
Software Dependencies | Yes | "We used Python 3.8.12 and PyTorch 1.8.1 to implement MLP and HLL... we used TensorFlow 2.3.0 and TensorFlow Lattice 2.0.10."
Experiment Setup | Yes | "Regarding the hyperparameters, we chose the best hyperparameters for each combination of a neural network and a dataset; the learning rate was chosen from {1, 10^-1, 10^-2, 10^-3, 10^-4, 10^-5}, the batch size was chosen from {8, 16, 32, ..., 4096}, the number of neurons in the hidden layers was chosen from {16, 32, 64, ..., 512} for MLP and HLL, and the hyperparameter r was chosen from {4, 5, 6, 7} for TL and RTL. We used the Adam optimizer [Kingma and Ba, 2015]. We trained these neural network models for 1000 epochs (small datasets) or 100 epochs (large datasets) with certain early stopping criteria." (A training-loop sketch follows the table.)
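
For readers unfamiliar with the setting, the first sketch below shows what a partially monotone network means: a model whose output is guaranteed non-decreasing in a chosen subset of input features and unconstrained in the rest. It uses a simple non-negative-weight construction for illustration only; it is not HLL, and none of the class or argument names come from pmlayer (see https://github.com/IBM/pmlayer for the authors' actual implementation).

```python
# A toy partially monotone regressor: its output is non-decreasing in the features listed
# in `monotone_idx` and unconstrained in the rest. This simple non-negative-weight
# construction is for illustration only; it is NOT the HLL proposed in the paper and does
# NOT use pmlayer's API.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyPartiallyMonotoneNet(nn.Module):
    def __init__(self, num_features, monotone_idx, hidden=64):
        super().__init__()
        self.monotone_idx = list(monotone_idx)
        self.free_idx = [i for i in range(num_features) if i not in self.monotone_idx]
        # Squaring these weights makes them non-negative, which preserves monotonicity.
        self.w_mono = nn.Parameter(torch.randn(len(self.monotone_idx), hidden))
        self.free_net = nn.Sequential(nn.Linear(len(self.free_idx), hidden), nn.ReLU())
        self.w_out = nn.Parameter(torch.randn(hidden))

    def forward(self, x):
        x_mono = x[:, self.monotone_idx]
        x_free = x[:, self.free_idx]
        # Non-negative weights plus a monotone activation keep the output
        # non-decreasing in every feature of x_mono.
        h = F.relu(x_mono @ (self.w_mono ** 2) + self.free_net(x_free))
        return h @ (self.w_out ** 2)

# Example (indices are illustrative): monotone in features 0 and 2 of a 5-feature input.
# net = ToyPartiallyMonotoneNet(num_features=5, monotone_idx=[0, 2])
# y = net(torch.randn(8, 5))
```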
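
The split protocol quoted under Dataset Splits can be reproduced with standard tooling. The sketch below assumes scikit-learn's train_test_split and an arbitrary random seed; the paper does not state which splitting utility or seed was used.

```python
# A minimal sketch of the split protocol quoted under "Dataset Splits", assuming
# scikit-learn's train_test_split. The random seed is arbitrary.
from sklearn.model_selection import train_test_split

def split_unsplit_dataset(X, y, seed=0):
    """60% training / 20% validation / 20% test for datasets without a predefined split."""
    X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.6, random_state=seed)
    X_val, X_test, y_val, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=seed)
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

def split_presplit_dataset(X_train_full, y_train_full, seed=0):
    """80% training / 20% validation carved out of a predefined training set;
    the predefined test set is kept unchanged."""
    X_train, X_val, y_train, y_val = train_test_split(
        X_train_full, y_train_full, train_size=0.8, random_state=seed)
    return (X_train, y_train), (X_val, y_val)
```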
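
Finally, the hyperparameter grid and training procedure quoted under Experiment Setup can be summarized as in the sketch below. The model, data loaders, loss function, and early-stopping patience are placeholders; the paper reports early stopping but not its exact criterion.

```python
# An illustrative reconstruction of the hyperparameter grid and training loop described
# under "Experiment Setup". Model, loaders, loss, and patience are assumptions.
import itertools
import torch

learning_rates = [1.0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
batch_sizes = [2 ** k for k in range(3, 13)]   # 8, 16, 32, ..., 4096
hidden_units = [2 ** k for k in range(4, 10)]  # 16, 32, 64, ..., 512 (MLP and HLL only)
lattice_r = [4, 5, 6, 7]                       # hyperparameter r (TL and RTL only)

def train_one_config(model, train_loader, val_loader, lr, max_epochs, patience=20):
    """Adam training with early stopping on validation loss (patience value is assumed)."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = torch.nn.MSELoss()  # assumed; the paper evaluates several tasks
    best_val, best_state, bad_epochs = float("inf"), None, 0
    for _ in range(max_epochs):   # 1000 epochs for small datasets, 100 for large ones
        model.train()
        for xb, yb in train_loader:
            opt.zero_grad()
            loss_fn(model(xb), yb).backward()
            opt.step()
        model.eval()
        with torch.no_grad():
            val = sum(loss_fn(model(xb), yb).item() for xb, yb in val_loader)
        if val < best_val:
            best_val, bad_epochs = val, 0
            best_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
        else:
            bad_epochs += 1
            if bad_epochs >= patience:
                break
    return best_val, best_state

# Grid search over configurations (model and loader construction omitted):
# for lr, bs, h in itertools.product(learning_rates, batch_sizes, hidden_units):
#     ...
```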