Model Distillation for Revenue Optimization: Interpretable Personalized Pricing

Authors: Max Biggs, Wei Sun, Markus Ettl

ICML 2021 | Conference PDF | Archive PDF | Plain Text | LLM Run Details

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | We quantify the regret of a resulting policy and demonstrate its efficacy in applications with both synthetic and real-world datasets.
Researcher Affiliation | Collaboration | Darden School of Business, University of Virginia, Virginia, USA; IBM Research, Yorktown Heights, New York, USA.
Pseudocode | No | The paper describes the 'Recursive partitioning algorithm for student prescriptive trees' in narrative text in Section 3.3, but it does not provide formal pseudocode or an algorithm block (a hedged sketch of such a partitioning loop appears after this table).
Open Source Code | No | The paper mentions a lack of open-source code for a benchmark ('Due to the lack of open-source code for Kallus, 2017'), but it does not state that its own source code is openly available or provide a link to it.
Open Datasets | Yes | We benchmark the algorithms on a publicly available retail dataset collated by the analytics firm Dunnhumby (https://www.dunnhumby.com/careers/engineering/sourcefiles). The Complete Journey dataset contains household-level transactions over two years from a group of 2,500 households...
Dataset Splits | No | For simulated datasets, the paper states that 'The number of training samples is held constant at n = 5000' and explores varying training data sizes. For the Dunnhumby dataset, it states 'We divide the dataset in half' for evaluation purposes. However, it does not specify a distinct validation set, or a split percentage/size for hyperparameter tuning or model selection, for its own proposed method.
Hardware Specification | No | The paper does not provide any specific details about the hardware used for running the experiments, such as GPU or CPU models, memory, or cloud instance types. It only mentions using the 'light GBM package'.
Software Dependencies | No | The paper mentions using 'the light GBM package (Ke et al., 2017)' as the teacher model. However, it does not specify a version number for LightGBM or any other software dependencies, which would be necessary for reproducibility.
Experiment Setup | Yes | For the teacher model, we train a gradient boosted tree ensemble model using the light GBM package (Ke et al., 2017). We use default parameter values, with 50 boosting rounds. The discretized price set is set to 9 prices, ranging from the 10th to the 90th percentile of observed prices in 10% increments. [...] We use tree depth as the termination criterion, which corresponds to the desired complexity of the pricing policy, but can easily incorporate other commonly used approaches such as a minimum leaf size criterion, or minimum improvement in impurity/revenue. In all experiments each tree was grown to the full width for a given depth, i.e., 2^k leaves for a tree of depth k. [...] We explore how the expected revenue changes with the depth of a tree (k = {1, 2, 3, 4, 5}). [...] For each tree depth, we run 10 independent simulations for each dataset. [...] To show that our splitting criterion performs well with alternative termination criteria and to provide further insight into the interpretability of our approach, we show results where the termination criterion is determined by minsplit: if the number of observations at a potential split is less than a threshold, no further splits will occur on that branch. (A hedged code sketch of this setup follows the table.)
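
The experiment-setup row quotes a LightGBM teacher trained with default parameters and 50 boosting rounds, and a 9-point price grid spanning the 10th to the 90th percentile of observed prices. The sketch below is a minimal reconstruction of that setup, not the authors' released code (none exists): the data layout (a feature matrix plus one observed price and a binary purchase indicator per transaction), the function name, and all variable names are assumptions.

```python
import numpy as np
from lightgbm import LGBMClassifier


def fit_teacher_and_price_grid(X, observed_prices, purchased):
    """Teacher: gradient-boosted trees mapping (features, price) to purchase probability."""
    teacher = LGBMClassifier(n_estimators=50)  # 50 boosting rounds, LightGBM defaults otherwise
    teacher.fit(np.column_stack([X, observed_prices]), purchased)

    # 9 candidate prices: 10th to 90th percentile of observed prices in 10% increments.
    price_grid = np.percentile(observed_prices, np.arange(10, 100, 10))

    # Teacher-predicted purchase probability for every sample at every candidate price;
    # this demand matrix is what a student pricing tree can be distilled from (next sketch).
    demand = np.column_stack([
        teacher.predict_proba(np.column_stack([X, np.full(len(X), p)]))[:, 1]
        for p in price_grid
    ])
    return teacher, price_grid, demand
```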
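The pseudocode row notes that the 'Recursive partitioning algorithm for student prescriptive trees' is described only in narrative text. The sketch below is one plausible reading of that description under stated assumptions, not the authors' algorithm: each node greedily picks the (feature, threshold) split that maximizes teacher-predicted revenue, each leaf is assigned the single revenue-maximizing price from the discretized grid, and tree depth is the termination criterion, as in the quoted setup.

```python
import numpy as np


def leaf_value(demand, prices):
    """Best single price for a group: maximize price * mean teacher-predicted demand."""
    revenue_per_price = prices * demand.mean(axis=0)   # expected revenue at each candidate price
    best = int(np.argmax(revenue_per_price))
    return prices[best], revenue_per_price[best] * len(demand)


def grow_tree(X, demand, prices, depth):
    """Greedy recursive partitioning on the split that maximizes teacher-predicted revenue."""
    price, revenue = leaf_value(demand, prices)
    if depth == 0 or len(X) < 2:
        return {"price": price, "revenue": revenue}

    best = None
    for j in range(X.shape[1]):
        for threshold in np.unique(X[:, j])[:-1]:      # split points leaving both sides non-empty
            left = X[:, j] <= threshold
            _, rev_left = leaf_value(demand[left], prices)
            _, rev_right = leaf_value(demand[~left], prices)
            if best is None or rev_left + rev_right > best["revenue"]:
                best = {"feature": j, "threshold": float(threshold),
                        "revenue": rev_left + rev_right, "mask": left}
    if best is None:                                   # all features constant: keep the leaf
        return {"price": price, "revenue": revenue}

    mask = best.pop("mask")
    best["left"] = grow_tree(X[mask], demand[mask], prices, depth - 1)
    best["right"] = grow_tree(X[~mask], demand[~mask], prices, depth - 1)
    return best
```

Under these assumptions, the depth sweep in the quoted setup would amount to calling grow_tree(X, demand, price_grid, depth=k) for k in {1, ..., 5} on the demand matrix produced by the teacher sketch above.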