Fast and Flexible Monotonic Functions with Ensembles of Lattices

Authors: Mahdi Milani Fard, Kevin Canini, Andrew Cotter, Jan Pfeifer, Maya Gupta

NeurIPS 2016 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | "We demonstrate that compared to random forests, these ensembles produce similar or better accuracy, while providing guaranteed monotonicity consistent with prior knowledge, smaller model size and faster evaluation." and "We demonstrate the proposals on four datasets." (Section 7, Experiments)
Researcher Affiliation | Industry | "K. Canini, A. Cotter, M. R. Gupta, M. Milani Fard, J. Pfeifer, Google Inc., 1600 Amphitheatre Parkway, Mountain View, CA 94043, {canini,acotter,mayagupta,janpf,mmilanifard}@google.com"
Pseudocode | No | The paper describes the lattice training steps as a numbered list but does not present them as a formal pseudocode block or algorithm.
Open Source Code | No | The paper does not provide an explicit statement or link indicating that code for the described methodology has been open-sourced.
Open Datasets | Yes | Dataset 1 is the ADULT dataset from the UCI Machine Learning Repository [19].
Dataset Splits | Yes | "We split 800k labelled samples based on time, using the 500k oldest samples for a training set, the next 100k samples for a validation set, and the most recent 200k samples for a testing set (so the three datasets are not IID)." (A sketch of this time-based split appears after the table.)
Hardware Specification | No | The paper discusses evaluation speed and memory usage but does not give specific hardware details (e.g., CPU/GPU models, memory size, or cloud instance types) used to run the experiments.
Software Dependencies | No | The paper mentions using "Light Touch [20]" and "a C++ package implementation for RF" but does not provide version numbers for these or other software dependencies.
Experiment Setup | Yes | "The best RF on the validation set used 350 trees with a leaf size of 1, and the best Crystals model used 350 lattices with 6 features per lattice. All models were trained using logistic loss, a mini-batch size of 100, and 200 loops. For each model, we chose the optimization algorithm's step size by finding the power of 2 that maximized accuracy on the validation set." (A sketch of this step-size search appears after the table.)
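For reference, the time-based split quoted in the Dataset Splits row can be reproduced with a few lines of NumPy. This is a minimal sketch under the assumption that the 800k labelled samples carry per-sample timestamps; the function and array names are illustrative, not taken from the paper.

```python
import numpy as np

def time_based_split(features, labels, timestamps):
    """Sort the 800k samples by time, then take the 500k oldest for
    training, the next 100k for validation, and the most recent 200k
    for testing (so the three sets are deliberately not IID)."""
    order = np.argsort(timestamps)  # oldest samples first
    train_idx = order[:500_000]
    val_idx = order[500_000:600_000]
    test_idx = order[600_000:800_000]
    return ((features[train_idx], labels[train_idx]),
            (features[val_idx], labels[val_idx]),
            (features[test_idx], labels[test_idx]))
```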
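The Experiment Setup row notes that each model's step size was chosen as the power of 2 that maximized validation accuracy. A minimal sketch of such a search is given below, assuming a hypothetical train_and_validate callback that trains a model with the given step size and returns its validation accuracy; the range of exponents searched is also an assumption, as the paper does not state it.

```python
def pick_step_size(train_and_validate, exponents=range(-10, 4)):
    """Grid-search the optimizer step size over powers of 2 and keep
    the one with the highest validation accuracy.

    train_and_validate: hypothetical callback(step_size) -> accuracy.
    exponents: assumed search range; not specified in the paper."""
    best_step, best_acc = None, float("-inf")
    for p in exponents:
        step = 2.0 ** p
        acc = train_and_validate(step_size=step)
        if acc > best_acc:
            best_step, best_acc = step, acc
    return best_step
```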