Transparency Promotion with Model-Agnostic Linear Competitors
Authors: Hassan Rafique, Tong Wang, Qihang Lin, Arshia Singhani
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that MALC offers more model flexibility for users to balance transparency and accuracy... To evaluate the model, we conduct experiments on public datasets and compare MALC with interpretable baseline models. In addition, to study whether MALC is likely to be accepted by users and to understand humans' preferences for transparency and accuracy, we conduct a human evaluation on a group of 72 subjects. (A hedged sketch of the hybrid prediction idea follows the table.) |
| Researcher Affiliation | Collaboration | Hassan Rafique (1), Tong Wang (2), Qihang Lin (2), Arshia Singhani (3). (1) Program in Applied Mathematical and Computational Sciences, The University of Iowa, Iowa City, Iowa, USA; (2) Department of Business Analytics, The University of Iowa, Iowa City, Iowa, USA; (3) BASIS Independent Silicon Valley, San Jose, California, USA. |
| Pseudocode | Yes | Since numerical optimization is not the focus of this paper, we will simply utilize the accelerated proximal gradient method (APG) by Nesterov (Nesterov, 2013) to solve (2) when φ is smooth. See the algorithm in the Appendix. (A generic APG sketch follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the proposed methodology. |
| Open Datasets | Yes | We analyze four real-world datasets that are publicly available at (Chang & Lin, 2011; Ilangovan, 2017; Kaggle, 2018; Wang et al., 2017). 1) Coupon (Wang et al., 2017) (...) 2) Covtype (Chang & Lin, 2011) (...) 3) Customer (Kaggle, 2018) (...) 4) Medical (Ilangovan, 2017) (...) |
| Dataset Splits | Yes | For each dataset, we randomly sample 80% of the instances to form the training set and use the remaining 20% as the testing set. For each model, we identify one or two hyperparameters, and, for each dataset, we apply an 80%-20% holdout method on the training set to select the values for these hyperparameters from a discrete set of candidates that give the best validation performance. (See the split sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software like 'R or python', 'ranger package', 'xgboost package', and 'keras package' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For Random Forest, we use 500 trees and tune the minimum node size and maximal tree depth. For XGBoost, we tune the maximal tree depth and the number of boosting iterations. For the neural network, we choose the sigmoid function as the activation function and tune the number of neurons and the dropout rates in the two hidden layers... Overall, we choose C1 from [0.005, 0.95] and C2 from [0.03, 0.25]. (A holdout-search sketch follows the table.) |
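The paper does not release code, so the sketches below are illustrative only.

The transparency/accuracy trade-off behind the Research Type row can be pictured with a minimal hybrid-prediction sketch. Assumptions, labeled loudly: `linear_coef`, `intercept`, `black_box`, and the threshold `theta` are hypothetical names, and routing by the magnitude of the linear score is a simplification for illustration, not MALC's learned objective.

```python
import numpy as np

def malc_style_predict(X, linear_coef, intercept, black_box, theta=1.0):
    """Illustrative hybrid rule: use the transparent linear competitor where
    its score is confident, defer to the black box elsewhere. (Simplified
    stand-in for MALC's learned partition, not the paper's method.)"""
    scores = X @ linear_coef + intercept   # linear competitor scores
    use_linear = np.abs(scores) >= theta   # assumed routing rule
    preds = np.where(scores > 0, 1, 0)     # linear predictions everywhere
    if (~use_linear).any():                # overwrite deferred instances
        preds[~use_linear] = black_box.predict(X[~use_linear])
    transparency = use_linear.mean()       # share handled transparently
    return preds, transparency
```

A larger `theta` defers more instances to the black box, trading transparency for (typically) accuracy; that balance is what the experiments vary.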
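For the Pseudocode row: the appendix algorithm is Nesterov's accelerated proximal gradient method. Below is a generic APG sketch for a composite objective min_w f(w) + λ‖w‖₁, with soft-thresholding as the proximal step; the loss, step size, and ℓ1 regularizer are placeholder assumptions, not the paper's problem (2).

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||w||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def apg(grad_f, prox, w0, step, n_iters=500):
    """Accelerated proximal gradient (FISTA-style) for min f(w) + phi(w)."""
    w_prev = w = w0.copy()
    t_prev = 1.0
    for _ in range(n_iters):
        t = (1 + np.sqrt(1 + 4 * t_prev ** 2)) / 2
        y = w + ((t_prev - 1) / t) * (w - w_prev)  # Nesterov extrapolation
        w_prev, t_prev = w, t
        w = prox(y - step * grad_f(y), step)       # proximal gradient step
    return w

# Usage with an l1 penalty of weight lam (lam, grad_f, d are assumptions):
# w_hat = apg(grad_f, lambda v, s: soft_threshold(v, s * lam),
#             w0=np.zeros(d), step=0.1)
```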
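The Dataset Splits protocol is straightforward to reproduce; a minimal sketch assuming scikit-learn (the paper only names 'R or python') and pre-loaded arrays `X`, `y`:

```python
from sklearn.model_selection import train_test_split

# X, y: features and labels loaded beforehand (assumed)
# 80% training / 20% testing, as stated in the paper
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 80%-20% holdout on the training set for hyperparameter selection
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)
```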
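And the Experiment Setup row's holdout search over discrete candidates, sketched with scikit-learn's RandomForestClassifier standing in for the ranger package; the candidate grids below are illustrative, not the paper's, and the same loop shape applies to MALC's C1 ∈ [0.005, 0.95] and C2 ∈ [0.03, 0.25]:

```python
from itertools import product
from sklearn.ensemble import RandomForestClassifier

best_score, best_params = -1.0, None
# 500 trees fixed (per the paper); tune minimum node size and maximal depth
for min_leaf, depth in product([1, 5, 10], [4, 8, 16]):  # assumed grids
    clf = RandomForestClassifier(n_estimators=500, min_samples_leaf=min_leaf,
                                 max_depth=depth, random_state=0)
    clf.fit(X_fit, y_fit)
    score = clf.score(X_val, y_val)   # validation accuracy on the holdout
    if score > best_score:
        best_score, best_params = score, (min_leaf, depth)
```

The chosen `best_params` would then be refit on the full training set and evaluated once on the held-out 20% test set.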