Transparency Promotion with Model-Agnostic Linear Competitors
Authors: Hassan Rafique, Tong Wang, Qihang Lin, Arshia Singhani
ICML 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experiments show that MALC offers more model flexibility for users to balance transparency and accuracy... To evaluate the model, we conduct experiments on public datasets and compare MALC with interpretable baseline models. In addition, to study whether MALC is likely to be accepted by users and to understand humans' preferences for transparency and accuracy, we conduct a human evaluation on a group of 72 subjects. (A hedged sketch of the hybrid prediction idea follows the table.) |
| Researcher Affiliation | Collaboration | Hassan Rafique (1), Tong Wang (2), Qihang Lin (2), Arshia Singhani (3). (1) Program in Applied Mathematical and Computational Sciences, The University of Iowa, Iowa City, Iowa, USA; (2) Department of Business Analytics, The University of Iowa, Iowa City, Iowa, USA; (3) BASIS Independent Silicon Valley, San Jose, California, USA. |
| Pseudocode | Yes | Since numerical optimization is not the focus of this paper, we will simply utilize the accelerated proximal gradient method (APG) by Nesterov (Nesterov, 2013) to solve (2) when φ is smooth. See the algorithm in the Appendix. (A generic APG sketch follows the table.) |
| Open Source Code | No | The paper does not provide an explicit statement or link for the open-source code of the proposed methodology. |
| Open Datasets | Yes | We analyze four real-world datasets that are publicly available at (Chang & Lin, 2011; Ilangovan, 2017; Kaggle, 2018; Wang et al., 2017). 1) Coupon (Wang et al., 2017) (...) 2) Covtype (Chang & Lin, 2011) (...) 3) Customer (Kaggle, 2018) (...) 4) Medical (Ilangovan, 2017) (...) |
| Dataset Splits | Yes | For each dataset, we randomly sample 80% of the instances to form the training set and use the remaining 20% as the testing set. For each model, we identify one or two hyperparameters, and, for each dataset, we apply an 80%-20% holdout method on the training set to select the values for these hyperparameters from a discrete set of candidates that give the best validation performance. (See the split sketch after the table.) |
| Hardware Specification | No | The paper does not provide specific hardware details (e.g., CPU/GPU models, memory) used for running its experiments. |
| Software Dependencies | No | The paper mentions software like 'R or python', 'ranger package', 'xgboost package', and 'keras package' but does not provide specific version numbers for these dependencies. |
| Experiment Setup | Yes | For Random Forest, we use 500 trees and tune the minimum node size and maximal tree depth. For XGBoost, we tune the maximal tree depth and the number of boosting iterations. For the neural network, we choose the sigmoid function as the activation function and tune the number of neurons and the dropout rates in the two hidden layers... Overall, we choose C1 from [0.005, 0.95] and C2 from [0.03, 0.25]. (A holdout-search sketch follows the table.) |
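The paper does not release code, so the sketches below are illustrative only.

The transparency/accuracy trade-off behind the Research Type row can be pictured with a minimal hybrid-prediction sketch. Assumptions, labeled loudly: `linear_coef`, `intercept`, `black_box`, and the threshold `theta` are hypothetical names, and routing by the magnitude of the linear score is a simplification for illustration, not MALC's learned objective.

```python
import numpy as np

def malc_style_predict(X, linear_coef, intercept, black_box, theta=1.0):
    """Illustrative hybrid rule: use the transparent linear competitor where
    its score is confident, defer to the black box elsewhere. (Simplified
    stand-in for MALC's learned partition, not the paper's method.)"""
    scores = X @ linear_coef + intercept   # linear competitor scores
    use_linear = np.abs(scores) >= theta   # assumed routing rule
    preds = np.where(scores > 0, 1, 0)     # linear predictions everywhere
    if (~use_linear).any():                # overwrite deferred instances
        preds[~use_linear] = black_box.predict(X[~use_linear])
    transparency = use_linear.mean()       # share handled transparently
    return preds, transparency
```

A larger `theta` defers more instances to the black box, trading transparency for (typically) accuracy; that balance is what the experiments vary.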
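For the Pseudocode row: the appendix algorithm is Nesterov's accelerated proximal gradient method. Below is a generic APG sketch for a composite objective min_w f(w) + λ‖w‖₁, with soft-thresholding as the proximal step; the loss, step size, and ℓ1 regularizer are placeholder assumptions, not the paper's problem (2).

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau * ||w||_1 (soft-thresholding)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def apg(grad_f, prox, w0, step, n_iters=500):
    """Accelerated proximal gradient (FISTA-style) for min f(w) + phi(w)."""
    w_prev = w = w0.copy()
    t_prev = 1.0
    for _ in range(n_iters):
        t = (1 + np.sqrt(1 + 4 * t_prev ** 2)) / 2
        y = w + ((t_prev - 1) / t) * (w - w_prev)  # Nesterov extrapolation
        w_prev, t_prev = w, t
        w = prox(y - step * grad_f(y), step)       # proximal gradient step
    return w

# Usage with an l1 penalty of weight lam (lam, grad_f, d are assumptions):
# w_hat = apg(grad_f, lambda v, s: soft_threshold(v, s * lam),
#             w0=np.zeros(d), step=0.1)
```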
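The Dataset Splits protocol is straightforward to reproduce; a minimal sketch assuming scikit-learn (the paper only names 'R or python') and pre-loaded arrays `X`, `y`:

```python
from sklearn.model_selection import train_test_split

# X, y: features and labels loaded beforehand (assumed)
# 80% training / 20% testing, as stated in the paper
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

# 80%-20% holdout on the training set for hyperparameter selection
X_fit, X_val, y_fit, y_val = train_test_split(
    X_train, y_train, test_size=0.2, random_state=0)
```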
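And the Experiment Setup row's holdout search over discrete candidates, sketched with scikit-learn's RandomForestClassifier standing in for the ranger package; the candidate grids below are illustrative, not the paper's, and the same loop shape applies to MALC's C1 ∈ [0.005, 0.95] and C2 ∈ [0.03, 0.25]:

```python
from itertools import product
from sklearn.ensemble import RandomForestClassifier

best_score, best_params = -1.0, None
# 500 trees fixed (per the paper); tune minimum node size and maximal depth
for min_leaf, depth in product([1, 5, 10], [4, 8, 16]):  # assumed grids
    clf = RandomForestClassifier(n_estimators=500, min_samples_leaf=min_leaf,
                                 max_depth=depth, random_state=0)
    clf.fit(X_fit, y_fit)
    score = clf.score(X_val, y_val)   # validation accuracy on the holdout
    if score > best_score:
        best_score, best_params = score, (min_leaf, depth)
```

The chosen `best_params` would then be refit on the full training set and evaluated once on the held-out 20% test set.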