Hierarchical Lattice Layer for Partially Monotone Neural Networks
Authors: Hiroki Yanagisawa, Kohei Miyaguchi, Takayuki Katsuki
NeurIPS 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Our experiments demonstrate that HLL did not sacrifice its prediction performance on real datasets compared with the lattice layer. |
| Researcher Affiliation | Industry | Hiroki Yanagisawa, IBM Research Tokyo, IBM Japan, Ltd., Tokyo, Japan, yanagis@jp.ibm.com; Kohei Miyaguchi, IBM Research Tokyo, IBM Japan, Ltd., Tokyo, Japan, miyaguchi@ibm.com; Takayuki Katsuki, IBM Research Tokyo, IBM Japan, Ltd., Tokyo, Japan, kats@jp.ibm.com |
| Pseudocode | No | The paper does not contain any structured pseudocode or algorithm blocks. |
| Open Source Code | Yes | We used Python 3.8.12 and PyTorch 1.8.1 to implement MLP and HLL, and the implementation of HLL is available at https://github.com/IBM/pmlayer. |
| Open Datasets | Yes | We used 12 real datasets taken from the UCI Machine Learning Repository [Dua and Graff, 2017] in our experiments. |
| Dataset Splits | Yes | In our experiments, we split the data points into training, validation, and test data points. For the unsplit datasets, we divided the data points into training (60%), validation (20%), and test (20%). For the datasets that were already split into training and test datasets, we further divided the data points in the training dataset into training (80%) and validation (20%) and kept the test dataset unchanged. (A splitting sketch follows the table.) |
| Hardware Specification | Yes | All our experiments were conducted on a virtual machine with an Intel Xeon CPU (3.30 GHz) processor without any GPU and 64 GB of memory running Red Hat Enterprise Linux Server 7.6. |
| Software Dependencies | Yes | We used Python 3.8.12 and PyTorch 1.8.1 to implement MLP and HLL... we used TensorFlow 2.3.0 and TensorFlow Lattice 2.0.10. |
| Experiment Setup | Yes | Regarding the hyperparameters, we chose the best hyperparameters for each combination of a neural network and a dataset; the learning rate was chosen from {1, 10^-1, 10^-2, 10^-3, 10^-4, 10^-5}, the batch size was chosen from {8, 16, 32, ..., 4096}, the number of neurons in the hidden layers was chosen from {16, 32, 64, ..., 512} for MLP and HLL, and the hyperparameter r was chosen from {4, 5, 6, 7} for TL and RTL. We used the Adam optimizer [Kingma and Ba, 2015]. We trained these neural network models for 1000 epochs (small datasets) or 100 epochs (large datasets) with certain early stopping criteria. (A grid-search sketch follows the table.) |
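
The 60/20/20 and 80/20 splits quoted in the Dataset Splits row can be reproduced with standard tooling. The sketch below assumes scikit-learn's `train_test_split` and NumPy-style arrays; the function names and random seeds are illustrative, not the authors' code.

```python
# Minimal sketch of the splits described in the Dataset Splits row.
# Assumptions: scikit-learn is installed, X and y are array-like, seeds are illustrative.
from sklearn.model_selection import train_test_split

def split_unsplit_dataset(X, y, seed=0):
    """60% train / 20% validation / 20% test for datasets shipped unsplit."""
    X_train, X_tmp, y_train, y_tmp = train_test_split(
        X, y, test_size=0.4, random_state=seed)          # hold out 40%
    X_val, X_test, y_val, y_test = train_test_split(
        X_tmp, y_tmp, test_size=0.5, random_state=seed)  # split 40% into 20% + 20%
    return (X_train, y_train), (X_val, y_val), (X_test, y_test)

def split_presplit_dataset(X_train_full, y_train_full, seed=0):
    """80% train / 20% validation carved out of an existing training set;
    the dataset's own test split is kept unchanged."""
    X_train, X_val, y_train, y_val = train_test_split(
        X_train_full, y_train_full, test_size=0.2, random_state=seed)
    return (X_train, y_train), (X_val, y_val)
```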
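
The Experiment Setup row describes a grid search over learning rate, batch size, hidden width, and (for TL and RTL) the hyperparameter r, trained with Adam. The following PyTorch sketch shows one way such a search could be wired up; `build_model` and `train_and_eval` are hypothetical helpers supplied by the caller, and early stopping is assumed to happen inside `train_and_eval`.

```python
# Sketch of the hyperparameter grid quoted above. The model builder and
# training loop are placeholders, not the authors' code.
import itertools
import torch

LEARNING_RATES = [1e0, 1e-1, 1e-2, 1e-3, 1e-4, 1e-5]
BATCH_SIZES    = [2 ** k for k in range(3, 13)]   # 8, 16, ..., 4096
HIDDEN_SIZES   = [2 ** k for k in range(4, 10)]   # 16, 32, ..., 512 (MLP and HLL)
R_VALUES       = [4, 5, 6, 7]                     # TL / RTL only

def grid_search(build_model, train_and_eval, epochs=1000):
    """Return the configuration with the best validation loss.
    Use epochs=1000 for small datasets, epochs=100 for large ones."""
    best_loss, best_config = float("inf"), None
    for lr, bs, hidden in itertools.product(LEARNING_RATES, BATCH_SIZES, HIDDEN_SIZES):
        model = build_model(hidden_size=hidden)
        optimizer = torch.optim.Adam(model.parameters(), lr=lr)
        val_loss = train_and_eval(model, optimizer, batch_size=bs,
                                  epochs=epochs)  # early stopping inside
        if val_loss < best_loss:
            best_loss, best_config = val_loss, {"lr": lr, "batch_size": bs,
                                                "hidden": hidden}
    return best_loss, best_config
```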