Multidimensional Shape Constraints

Authors: Maya Gupta, Erez Louidor, Oleksandr Mangylov, Nobu Morioka, Taman Narayan, Sen Zhao

ICML 2020

Reproducibility checklist. Each entry lists the variable, the assessed result, and the supporting LLM response (quoted from the paper where applicable):

Research Type: Experimental
"Real-world experiments illustrate how the different shape constraints can be used to increase explainability and improve regularization, especially for non-IID train-test distribution shift. We present experiments on public and proprietary real-world problems illustrating training with the proposed shape constraints, including the first public experimental evidence with Edgeworth constraints (Cotter et al. (2019a) only presented experiments with trapezoid constraints)."

Researcher Affiliation: Industry
"Google Research, Mountain View, California, USA. Correspondence to: Sen Zhao <senzhao@google.com>."

Pseudocode: No
The paper does not contain structured pseudocode or clearly labeled algorithm blocks.

Open Source Code: Yes
"Open-source code has been pushed to the TensorFlow Lattice 2.0 library and can be downloaded at github.com/tensorflow/lattice."

Open Datasets: Yes
"We compare models that forecast next week's sales based on sales in the past weeks (Kaggle, 2020c)." "In this experiment, we compare the performance of different models in predicting the daily success probability of climbing Mount Rainier (Kaggle, 2020a)." "The Google Play Store Apps dataset (Kaggle, 2020b)." "...probability that a person passes the bar given their GPA (horizontal axis) and LSAT test score (vertical axis) (Wightman, 1998)."

Dataset Splits: Yes
"Results in Fig. 4 are averaged over 100 random 80-20 train/test splits." "The first set used a non-IID train/test split based on a 6th piece of information, the app category: we trained on the most common category 'Family' (18 percent of samples), and tested on the other app categories (82 percent of samples)."
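
As a concrete illustration of the quoted protocol, here is a minimal sketch of averaging test error over 100 random 80-20 splits. It uses scikit-learn's train_test_split (an assumption; the paper does not name its splitting tool), and build_and_fit is a hypothetical stand-in for fitting any of the paper's models.

```python
# Sketch of the "100 random 80-20 train/test splits" protocol.
# X, y: NumPy feature/target arrays. build_and_fit is a hypothetical
# callable that trains a model; it is NOT from the paper's code.
import numpy as np
from sklearn.model_selection import train_test_split

def mean_test_error(X, y, build_and_fit, n_splits=100):
    errors = []
    for seed in range(n_splits):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=seed)
        model = build_and_fit(X_tr, y_tr)
        mse = np.mean((model.predict(X_te).ravel() - y_te.ravel()) ** 2)
        errors.append(mse)
    return float(np.mean(errors))
```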

Hardware Specification: No
The paper does not provide specific hardware details (e.g., exact GPU/CPU models, processor types, memory amounts, or detailed computer specifications) used for running its experiments.

Software Dependencies: Yes
"We use projected stochastic gradient descent in TensorFlow, and the TensorFlow Lattice 2.0 library's PLF layers and lattice layers."
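
The quoted dependencies map onto the TensorFlow Lattice 2.0 Keras layers: PWLCalibration for the piecewise-linear functions (PLFs) and Lattice for the lattice layer. Below is a minimal sketch of a two-feature calibrated-lattice model; the feature count, lattice sizes, and monotonicity settings are illustrative assumptions, not the paper's exact configurations.

```python
# Minimal calibrated-lattice sketch with TensorFlow Lattice 2.0
# (pip install tensorflow-lattice). Two features, both assumed
# monotonic increasing for illustration only.
import numpy as np
import tensorflow as tf
import tensorflow_lattice as tfl

def quantile_keypoints(x, num_keypoints=10):
    # PLF keys fixed before training to train-data quantiles,
    # as described in the paper (assumes distinct quantiles).
    return np.quantile(x, np.linspace(0.0, 1.0, num_keypoints))

def build_model(X_train):
    inputs = [tf.keras.layers.Input(shape=(1,)) for _ in range(2)]
    calibrated = [
        tfl.layers.PWLCalibration(
            input_keypoints=quantile_keypoints(X_train[:, d]),
            output_min=0.0,
            output_max=1.0,  # maps into the size-2 lattice's vertex range
            monotonicity='increasing',
        )(inputs[d])
        for d in range(2)
    ]
    fused = tf.keras.layers.Concatenate()(calibrated)
    output = tfl.layers.Lattice(
        lattice_sizes=[2, 2],
        monotonicities=['increasing', 'increasing'],
    )(fused)
    return tf.keras.Model(inputs=inputs, outputs=output)
```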

Experiment Setup: Yes
"For all experiments, we used the default ADAM stepsize of .001, ran the optimization of (3) until train loss converged, and used squared error as the training loss l in (3). All models that use PLFs use 10 keys K_d = 10 for each PLF, fixed before training to the train data quantiles. All models were trained for 100 epochs."
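
Continuing the sketch above, the quoted setup translates into a straightforward Keras compile/fit. The batch size is not reported in the quote, so the Keras default is kept, and Adam stands in for the paper's projected SGD variant.

```python
# Training setup from the quote: Adam with step size 0.001,
# squared-error loss, 100 epochs. Batch size is an assumption
# (Keras default); the paper runs until train loss converges.
model = build_model(X_train)
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss=tf.keras.losses.MeanSquaredError())
model.fit([X_train[:, :1], X_train[:, 1:2]], y_train,
          epochs=100, verbose=0)
```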