Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
The Causal Learning of Retail Delinquency
Authors: Yiyan Huang, Cheuk Hang Leung, Xing Yan, Qi Wu, Nanbo Peng, Dongdong Wang, Zhixiang Huang (pp. 204-212)
AAAI 2021 | Venue PDF | LLM Run Details
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | The proposed estimators are shown to be unbiased, consistent, and robust through a combination of theoretical analysis and numerical testing. Moreover, we compare the power of estimating the causal quantities between the classical estimators and the proposed estimators. The comparison is tested across a wide range of models, including linear regression models, tree-based models, and neural network-based models, under different simulated datasets that exhibit different levels of causality, different degrees of nonlinearity, and different distributional properties. Most importantly, we apply our approaches to a large observational dataset provided by a global technology firm that operates in both the e-commerce and the lending business. We find that the relative reduction of estimation error is strikingly substantial if the causal effects are accounted for correctly. |
| Researcher Affiliation | Collaboration | (1) JD Digits; (2) City University of Hong Kong; (3) ISBD, Renmin University of China |
| Pseudocode | No | The paper describes mathematical formulations and derivations but does not include any pseudocode or algorithm blocks. |
| Open Source Code | No | The paper does not provide any statement or link for open-source code for the methodology described. |
| Open Datasets | No | We now apply our method to a unique real-world dataset kindly provided by JD Digits, one of the largest global technology firms that operates in both the e-commerce business and the lending business. |
| Dataset Splits | No | All results are out-of-sample and we use 70% of data as the training set and the remaining 30% as the testing set. The paper does not explicitly state a separate validation split percentage or count. |
| Hardware Specification | Yes | The experiments are run on two Ubuntu HP Z4 Workstations each with Intel Core i9 10-Core CPU at 3.3GHz, 128G DIMM-2166 ECC RAM, and two sets of NVIDIA Quadro RTX 5000 GPU. |
| Software Dependencies | No | The paper mentions using Python for experiments but does not provide specific version numbers for Python or any other software libraries or dependencies. |
| Experiment Setup | Yes | The number of hidden layers ranges from 2 to 7 and the number of units for each layer from 50 to 500. The batch size is in integer multiples of 32 and optimized within [32, 3200]. We search the learning rate between 0.0001 and 0.1. |
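
The "Dataset Splits" and "Experiment Setup" rows quote a 70%/30% train/test split and a neural-network search space. A minimal sketch of how those two settings could be encoded is given below. The ranges are taken from the quoted text; everything else is an assumption made for illustration, including the function names `sample_config` and `train_test_split_indices`, the use of random search, the log-uniform learning-rate sampling, the shuffled split, and the seed. The paper does not state which search strategy or splitting procedure it used.

```python
import random

# Ranges quoted in the "Experiment Setup" row.
HIDDEN_LAYERS = (2, 7)                       # number of hidden layers, inclusive
UNITS_PER_LAYER = (50, 500)                  # units per layer, inclusive
BATCH_SIZES = list(range(32, 3200 + 1, 32))  # integer multiples of 32 in [32, 3200]
LOG10_LEARNING_RATE = (-4.0, -1.0)           # learning rate between 0.0001 and 0.1


def sample_config(rng):
    """Draw one candidate configuration.

    Assumption: random search with a log-uniform learning rate.
    The paper only states the ranges, not the search strategy.
    """
    n_layers = rng.randint(*HIDDEN_LAYERS)
    return {
        "hidden_units": [rng.randint(*UNITS_PER_LAYER) for _ in range(n_layers)],
        "batch_size": rng.choice(BATCH_SIZES),
        "learning_rate": 10 ** rng.uniform(*LOG10_LEARNING_RATE),
    }


def train_test_split_indices(n_rows, train_fraction=0.7, seed=0):
    """Return (train, test) row indices for a shuffled 70/30 split.

    Assumption: a single random shuffle; the paper does not say
    whether the split was random, stratified, or time-ordered.
    """
    rng = random.Random(seed)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    cut = round(n_rows * train_fraction)
    return indices[:cut], indices[cut:]


if __name__ == "__main__":
    rng = random.Random(0)
    print(sample_config(rng))
    train_idx, test_idx = train_test_split_indices(1000)
    print(len(train_idx), len(test_idx))
```

Because no validation split is reported, any model selection over this search space would need an additional hold-out or cross-validation scheme carved from the training portion; that choice is left open here.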