Causal Customer Churn Analysis with Low-rank Tensor Block Hazard Model

Authors: Chenyin Gao, Zhiming Zhang, Shu Yang

ICML 2024

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | The efficacy and superiority of our model are further validated through comprehensive experiments on both simulated and real-world applications.
Researcher Affiliation | Academia | ¹Department of Statistics, North Carolina State University, Raleigh, U.S.A. ²Independent scholar, Iowa State University, Ames, U.S.A.
Pseudocode | Yes | Algorithm 1 Projected gradient descent and spectral clustering for minimizing (4)
Open Source Code | Yes | Our implementation codes will be made publicly available after the acceptance of this manuscript. Our Python codes with illustrative examples are available at https://github.com/Gaochenyin/Low-Rank-Tensor-Block-Hazard-Model
Open Datasets | Yes | In this section, we apply our proposed method to one bank customer churn data from a Kaggle competition (https://www.kaggle.com/datasets/radheshyamkollipara/bankcustomer-churn). We provide an additional real-data application involving customer churn analysis of online retail. The data (https://www.kaggle.com/datasets/ankitverma2010/ecommerce-customer-churn-analysis-and-prediction) are collected by an E-commerce company.
Dataset Splits | Yes | For the model training, we divide the dataset into 80% training data and 20% test data, implementing 5-fold cross-validation.
Hardware Specification | Yes | All experiments are conducted on a computer with an Intel(R) Xeon(R) Gold 6226R CPU @ 2.90GHz and 32GB RAM.
Software Dependencies | No | The paper mentions using packages like “sklearn” and “sksurv” in Section 6, but it does not specify their version numbers or the version of Python used, which is necessary for reproducibility.
Experiment Setup | Yes | For starters, the baseline covariates $X \in \mathbb{R}^{N \times d}$ are generated by $X_i \overset{\text{i.i.d.}}{\sim} \mathcal{N}(0, I_d)$ with $d = 3$. We generate the true parameter tensor $\Theta$ with each entry $\theta_{i,t,l}$ defined by $\theta_{i,t,l} = (X_i^\top \eta_N)\,(t\,\eta_T / T)\,\mathrm{cum}\{(l)_2\}\,\eta_L$, where $\eta_N = (1, 1, 1)$, $\eta_T = 1$, $\eta_L = 1$, and $\mathrm{cum}\{(l)_2\}$ indicates the number of active treatments in $(l)_2$. We consider the setting where $T \in \{5, 10\}$, $N \in \{100, 300, 500, 1000, 2000\}$, and $k \in \{2, 3, 4\}$.
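
The simulated parameter tensor described in the Experiment Setup row can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. The function name `simulate_theta` is hypothetical, and several readings of the quoted text are assumptions: that the first factor is the inner product of the covariate vector with ηN, that t runs over 1, ..., T, that l indexes the 2^k on/off combinations of k treatments, and that the count of active treatments is the number of 1 bits in the binary representation of l. Only the standard library is used so the sketch runs without extra dependencies.

```python
import random


def simulate_theta(N, T, k, d=3, seed=0):
    """Return (X, theta), where theta[i][t - 1][l] holds the entry for unit i, time t, arm l."""
    rng = random.Random(seed)
    eta_N = [1.0] * d  # eta_N = (1, 1, 1) when d = 3
    eta_T = 1.0
    eta_L = 1.0

    # Baseline covariates: N rows drawn i.i.d. from N(0, I_d).
    X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(N)]

    # Assumption: one treatment index per on/off combination of k treatments.
    n_arms = 2 ** k

    theta = []
    for x in X:
        unit_effect = sum(a * b for a, b in zip(x, eta_N))  # inner product of X_i and eta_N
        per_time = []
        for t in range(1, T + 1):
            time_effect = t * eta_T / T
            per_time.append([
                # bin(l).count("1") counts the active treatments encoded in l.
                unit_effect * time_effect * bin(l).count("1") * eta_L
                for l in range(n_arms)
            ])
        theta.append(per_time)
    return X, theta


X, theta = simulate_theta(N=100, T=5, k=2)
```

Under these assumptions the all-zero treatment combination (l = 0) always gets a zero entry, and entries grow linearly in both t and the number of active treatments.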