Notice: The reproducibility variables underlying each score are classified using an automated LLM-based pipeline, validated against a manually labeled dataset. LLM-based classification introduces uncertainty and potential bias; scores should be interpreted as estimates. Full accuracy metrics and methodology are described in [1].
Machine Explanations and Human Understanding
Authors: Chacha Chen, Shi Feng, Amit Sharma, Chenhao Tan
TMLR 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | To validate our theoretical claims, we conduct human subject studies to show the importance of human intuitions. |
| Researcher Affiliation | Collaboration | Chacha Chen* (Department of Computer Science, University of Chicago); Shi Feng* (Department of Computer Science, University of Chicago); Amit Sharma (Microsoft Research); Chenhao Tan (Department of Computer Science, University of Chicago) |
| Pseudocode | No | The paper uses causal diagrams and theoretical frameworks, but there are no sections or figures explicitly labeled as 'Pseudocode' or 'Algorithm', nor are there structured code-like procedures presented. |
| Open Source Code | Yes | Also available at https://github.com/Chacha-Chen/Explanations-Human-Studies. |
| Open Datasets | Yes | Inspired by the Adult Income dataset (Blake, 1998), we choose the task of predicting a person’s annual income based on their profile because people generally have intuitions about what factors determine income but are unlikely to know every person’s income (hence a discovery task). |
| Dataset Splits | No | The paper describes the construction of synthetic data instances (Table 3) and their grouping (A-H) for human subject studies, but it does not specify training/validation/test splits, because the experiments involve human interaction with a simulated model rather than training or evaluating a new model. |
| Hardware Specification | No | The paper does not mention any specific hardware used for conducting the experiments or running the synthetic model. |
| Software Dependencies | No | The paper does not specify any particular software libraries or versions used for implementing their methodology or conducting their experiments. |
| Experiment Setup | No | The paper describes the design of a human subject study, including how synthetic data was generated and presented, how human intuitions were measured, and the evaluation metrics. However, it does not provide hyperparameters or system-level training settings, because the 'model' in the experiment is synthetic: its behavior is dictated by the experimental design to test human interaction, and it is not a trained ML model with tunable parameters. |