Diagnostics-Guided Explanation Generation
Authors: Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein (pp. 10445-10453)
AAAI 2022
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We perform experiments on three datasets from the ERASER benchmark (DeYoung et al. 2020a) (FEVER, MultiRC, Movies)... |
| Researcher Affiliation | Academia | Pepa Atanasova , Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein Department of Computer Science, University of Copenhagen, Denmark {pepa, simonsen, c.lioma, augenstein}@di.ku.dk |
| Pseudocode | No | The paper describes its methods in prose, detailing steps and components, but it does not include formal pseudocode blocks or algorithm listings. |
| Open Source Code | Yes | We make an extended version of the manuscript and code available on https://github.com/copenlu/diagnostic-guidedexplanations . |
| Open Datasets | Yes | We perform experiments on three datasets from the ERASER benchmark (DeYoung et al. 2020a) (FEVER, MultiRC, Movies), all of which require complex reasoning and have sentence-level rationales. |
| Dataset Splits | No | The paper uses standard benchmark datasets but does not explicitly provide specific percentages, sample counts, or citations for how training, validation, and test splits were performed. |
| Hardware Specification | No | The paper mentions using 'BERT (Devlin et al. 2019) base-uncased as our base architecture' but does not specify any hardware details (e.g., GPU/CPU models, memory, cloud instance types) used for running the experiments. |
| Software Dependencies | No | The paper mentions key software components like 'Transformer' and 'BERT base-uncased', but it does not provide specific version numbers for these or any other ancillary software dependencies. |
| Experiment Setup | No | The paper describes the model and training objectives, noting the use of hyperparameters like λ (for sparsity penalty) and K (for word masking), but it does not provide specific numerical values for these or other typical experimental setup details such as learning rate, batch size, or number of epochs. |