Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in Coakley et al. (K. L. Coakley, T. Snelleman, H. Hoos, and O. E. Gundersen, "The embrace of open science: An analysis of a decade of AI research and 56 800 conference papers," under review, 2026).
User Driven Model Adjustment via Boolean Rule Explanations
Authors: Elizabeth M. Daly, Massimiliano Mattetti, Öznur Alkan, Rahul Nair | Pages 5896-5904
AAAI 2021
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Experimental Evaluation |
| Researcher Affiliation | Industry | Elizabeth M. Daly, Massimiliano Mattetti, Öznur Alkan, Rahul Nair; IBM Research, Dublin, Ireland |
| Pseudocode | Yes | Algorithm 1: Generate Response, Algorithm 2: Evaluate Feedback Rules, Algorithm 3: Example of a transformation on a categorical feature when the class label is preserved, Algorithm 4: Example of a transformation on a numeric feature when the class label is changed |
| Open Source Code | No | The paper uses an open-source library AIX360 (footnote 3: https://github.com/Trusted-AI/AIX360) for BRCG implementation, but does not provide access to the code for their own interactive overlay solution. |
| Open Datasets | Yes | We select four well known binary classification benchmarks from the UCI repository TIC-TAC-TOE, BANKNOTE, BANK-MKT and BREAST CANCER. https://archive.ics.uci.edu/ml/datasets/ |
| Dataset Splits | No | The paper states: 'The data is divided into 80% for training and 20% for the holdout test set.' It describes training and test splits but does not explicitly mention a separate validation set split. |
| Hardware Specification | Yes | We implement all algorithms in Python using scikit-learn (Pedregosa et al. 2011) and perform the experiments on a cluster of Intel Xeon CPU E5-2683 processors at 2.00GHz with 8 Cores and 64GB of RAM. |
| Software Dependencies | No | The paper mentions using scikit-learn and the AIX360 library for BRCG implementation, but does not specify their version numbers or the Python version used. |
| Experiment Setup | Yes | For the purposes of these experiments the underlying machine learning algorithm used is a logistic regression with 500 iteration limit. Numeric features are pre-processed with Standard Scaler and the categorical one with a One Hot Encoder. At each iteration a batch of 10 instances is selected from the pool, labelled by the oracle with the ground truth, and then used for retraining the ML model. |
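
The setup quoted in the table (80/20 split, logistic regression with a 500-iteration limit, Standard Scaler and One Hot Encoder pre-processing, batches of 10 labelled instances per round) could be wired together in scikit-learn roughly as follows. This is a minimal sketch, not the authors' code: the synthetic data, the column names `amount` and `colour`, the 20-instance seed set, the random batch selection, and the five rounds are all assumptions made for illustration, since the paper excerpt does not specify them.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler


def make_model(numeric_cols, categorical_cols):
    """Scale numeric features, one-hot encode categorical ones, then fit LR."""
    preprocess = ColumnTransformer([
        ("num", StandardScaler(), numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ])
    return Pipeline([
        ("preprocess", preprocess),
        ("clf", LogisticRegression(max_iter=500)),  # 500 iteration limit
    ])


# Synthetic stand-in data (assumption; the paper uses UCI benchmarks).
rng = np.random.default_rng(0)
n = 200
X = pd.DataFrame({
    "amount": rng.normal(size=n),
    "colour": rng.choice(["red", "green", "blue"], size=n),
})
y = ((X["amount"] + (X["colour"] == "red").astype(float)) > 0.3).astype(int)

# 80% training pool, 20% holdout test set.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
X_train = X_train.reset_index(drop=True)
y_train = y_train.reset_index(drop=True)

# Hypothetical seed set: 10 instances per class so both labels are present.
labelled = (list(y_train[y_train == 0].index[:10])
            + list(y_train[y_train == 1].index[:10]))
seed = set(labelled)
pool = [i for i in range(len(X_train)) if i not in seed]

BATCH_SIZE = 10   # instances labelled by the oracle per iteration
N_ROUNDS = 5      # assumption; not stated in the excerpt

model = make_model(["amount"], ["colour"])
accuracies = []
for _ in range(N_ROUNDS):
    # Retrain on everything labelled so far and score on the holdout set.
    model.fit(X_train.iloc[labelled], y_train.iloc[labelled])
    accuracies.append(model.score(X_test, y_test))
    # The "oracle" reveals ground-truth labels for a random batch from the pool.
    batch = rng.choice(pool, size=BATCH_SIZE, replace=False).tolist()
    chosen = set(batch)
    labelled.extend(batch)
    pool = [i for i in pool if i not in chosen]
```

Random batch selection is used here only because it is the simplest placeholder; a different selection strategy would replace the `rng.choice` line without changing the rest of the loop.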