Demystifying Black-box Models with Symbolic Metamodels

Authors: Ahmed M. Alaa, Mihaela van der Schaar

NeurIPS 2019

Reproducibility Variable | Result | LLM Response
Research Type | Experimental | Building on the discussions in Section 4, we demonstrate the use cases of symbolic metamodeling through experiments on synthetic and real data. In all experiments, we used Sympy [19] (a symbolic computation library in Python) to carry out computations involving Meijer G-functions.
Researcher Affiliation | Academia | Ahmed M. Alaa, ECE Department, UCLA (ahmedmalaa@ucla.edu); Mihaela van der Schaar, UCLA, University of Cambridge, and The Alan Turing Institute (mv472@cam.ac.uk, mihaela@ee.ucla.edu)
Pseudocode | Yes | Algorithm 1 (Symbolic Metamodeling):
    Input: model f(x), hyperparameters (m, n, p, q, r)
    Output: metamodel g(x) ∈ G
    Sample X_i ~ Unif([0, 1]^d), i ∈ {1, ..., n}
    Repeat until convergence:
        θ_{k+1} := θ_k − γ ∇_θ Σ_i ℓ(G(X_i; θ), f(X_i)) |_{θ = θ_k}
    g(x) ← G(x; θ_k)
    If g(x) ∉ G:
        g(x) = G̃(x; θ̃), where G̃(x; θ̃) ∈ G and ‖θ̃ − θ_k‖ < δ, or
        g(x) = Chebyshev(g(x))
Open Source Code | Yes | The code is provided at https://bitbucket.org/mvdschaar/mlforhealthlabpub.
Open Datasets | Yes | Using data for 2,000 breast cancer patients extracted from the UK cancer registry (data description is in Appendix B), we fit an XGBoost model f(x) to predict the patients' 5-year mortality risk based on 5 features: age, number of nodes, tumor size, tumor grade, and estrogen-receptor (ER) status.
Dataset Splits | Yes | Using 5-fold cross-validation, we compare the area under the receiver operating characteristic curve (AUC-ROC) of the XGBoost model with that of the PREDICT risk calculator (https://breast.predict.nhs.uk/), which is the risk equation most commonly used in current practice [41].
Hardware Specification | No | No specific hardware details (e.g., GPU/CPU models, memory, or computational resources) used for running experiments were mentioned in the paper.
Software Dependencies | No | In all experiments, we used Sympy [19] (a symbolic computation library in Python) to carry out computations involving Meijer G-functions. We also used the gplearn library [40]. (No version numbers are provided for Sympy or gplearn.)
Experiment Setup | No | The paper mentions fitting a "2-layer neural network f(x) (with 200 hidden units)" and an "XGBoost model f(x)" but does not provide specific hyperparameters or detailed training configurations (e.g., learning rate, batch size, optimizer settings).
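The Sympy dependency noted in the table can be illustrated concretely. Below is a minimal sketch of symbolic Meijer G-function manipulation with SymPy's `meijerg` and `hyperexpand`; the particular example function is my own choice, not taken from the paper:

```python
# Minimal sketch: manipulating a Meijer G-function symbolically with SymPy,
# the library the paper reports using for these computations.
from sympy import symbols, meijerg, hyperexpand, exp

x = symbols('x', positive=True)

# G^{1,0}_{0,1}(x | -; 0), a Meijer G-function with the known closed form exp(-x).
g = meijerg([[], []], [[0], []], x)

# hyperexpand rewrites a Meijer G-function in terms of elementary or
# hypergeometric functions whenever such a representation exists.
closed_form = hyperexpand(g)
print(closed_form)  # -> exp(-x)
```

This closed-form recovery step is what makes a Meijer G-function parametrization attractive for metamodeling: a fitted G-function can often be read off as an ordinary algebraic expression.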
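The gradient-descent loop in the Algorithm 1 pseudocode can be sketched in a few lines. This is a simplified illustration, assuming a quadratic surrogate family in place of the paper's Meijer G-function parametrization G(x; θ) so that the gradient has a closed form; the black-box f here is a toy stand-in:

```python
# Hedged sketch of the Algorithm 1 loop: fit a parametric surrogate g(x; theta)
# to a black-box f by gradient descent on points sampled from Unif([0, 1]).
# A quadratic family stands in for the Meijer G-function parametrization.
import numpy as np

rng = np.random.default_rng(0)

def f(x):
    """Black-box model to be metamodeled (toy stand-in for a trained model)."""
    return 3.0 * x ** 2 + 1.0

# Step 1: sample evaluation points X_i ~ Unif([0, 1]).
X = rng.uniform(0.0, 1.0, size=200)
y = f(X)

# Surrogate g(x; theta) = theta_0 + theta_1 * x + theta_2 * x^2.
features = np.stack([np.ones_like(X), X, X ** 2], axis=1)
theta = np.zeros(3)
gamma = 0.5  # learning rate

# Step 2: theta_{k+1} := theta_k - gamma * grad of sum_i (g(X_i) - f(X_i))^2.
for _ in range(20000):
    residual = features @ theta - y  # g(X_i; theta) - f(X_i)
    theta = theta - gamma * 2.0 * features.T @ residual / len(X)

# theta converges toward the true coefficients [1, 0, 3].
```

The final "project or approximate" step of the pseudocode (snapping to a nearby closed-form member of G, or falling back to a Chebyshev expansion) is omitted here, since for this toy family the surrogate is already a polynomial.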
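The 5-fold cross-validated AUC-ROC evaluation described under Dataset Splits follows a standard pattern, sketched below. Synthetic data and scikit-learn's `GradientBoostingClassifier` stand in for the UK registry data and the XGBoost model, since neither the data nor the exact model configuration is available:

```python
# Hedged sketch of the evaluation protocol: 5-fold cross-validated AUC-ROC
# for a boosted-tree risk model. Synthetic data replaces the registry data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Five features, mirroring the paper's age, nodes, size, grade, ER status.
X, y = make_classification(n_samples=2000, n_features=5, n_informative=4,
                           n_redundant=0, random_state=0)

model = GradientBoostingClassifier(random_state=0)
aucs = cross_val_score(model, X, y, cv=5, scoring='roc_auc')
print(f"AUC-ROC: {aucs.mean():.3f} +/- {aucs.std():.3f}")
```

Comparing against the PREDICT calculator would additionally require scoring its risk equation on the same held-out folds.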