Fast and Flexible Monotonic Functions with Ensembles of Lattices

Authors: Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size and faster evaluation." and "We demonstrate the proposals on four datasets." (Section 7, Experiments)
Researcher Affiliation | Industry | "K. Canini, A. Cotter, M. R. Gupta, M. Milani Fard, J. Pfeifer, Google Inc., 1600 Amphitheatre Parkway, Mountain View, CA 94043, {canini,acotter,mayagupta,janpf,mmilanifard}@google.com"
Pseudocode | No | The paper describes the lattice training steps as a numbered list but does not present them as a formal pseudocode block or algorithm.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that code for the described methodology has been open-sourced.
Open Datasets | Yes | Dataset 1 is the ADULT dataset from the UCI Machine Learning Repository [19].
Dataset Splits | Yes | "We split 800k labelled samples based on time, using the 500k oldest samples for a training set, the next 100k samples for a validation set, and the most recent 200k samples for a testing set (so the three datasets are not IID)." (A sketch of this time-based split appears after the table.)
Hardware Specification | No | The paper discusses evaluation speed and memory usage but does not give specific hardware details (e.g., CPU/GPU models, memory size, or cloud instance types) used to run the experiments.
Software Dependencies | No | The paper mentions using "Light Touch [20]" and "a C++ package implementation for RF" but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | "The best RF on the validation set used 350 trees with a leaf size of 1, and the best Crystals model used 350 lattices with 6 features per lattice. All models were trained using logistic loss, a mini-batch size of 100, and 200 loops. For each model, we chose the optimization algorithm's step size by finding the power of 2 that maximized accuracy on the validation set." (A sketch of this step-size search appears after the table.)
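For reference, the time-based split quoted in the Dataset Splits row can be reproduced with a few lines of NumPy. This is a minimal sketch under the assumption that the 800k labelled samples carry per-sample timestamps; the function and array names are illustrative, not taken from the paper.

```python
import numpy as np

def time_based_split(features, labels, timestamps):
    """Sort the 800k samples by time, then take the 500k oldest for
    training, the next 100k for validation, and the most recent 200k
    for testing (so the three sets are deliberately not IID)."""
    order = np.argsort(timestamps)  # oldest samples first
    train_idx = order[:500_000]
    val_idx = order[500_000:600_000]
    test_idx = order[600_000:800_000]
    return ((features[train_idx], labels[train_idx]),
            (features[val_idx], labels[val_idx]),
            (features[test_idx], labels[test_idx]))
```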
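The Experiment Setup row notes that each model's step size was chosen as the power of 2 that maximized validation accuracy. A minimal sketch of such a search is given below, assuming a hypothetical train_and_validate callback that trains a model with the given step size and returns its validation accuracy; the range of exponents searched is also an assumption, as the paper does not state it.

```python
def pick_step_size(train_and_validate, exponents=range(-10, 4)):
    """Grid-search the optimizer step size over powers of 2 and keep
    the one with the highest validation accuracy.

    train_and_validate: hypothetical callback(step_size) -> accuracy.
    exponents: assumed search range; not specified in the paper."""
    best_step, best_acc = None, float("-inf")
    for p in exponents:
        step = 2.0 ** p
        acc = train_and_validate(step_size=step)
        if acc > best_acc:
            best_step, best_acc = step, acc
    return best_step
```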