Diagnostics-Guided Explanation Generation

Authors: Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein
Pages: 10445-10453

AAAI 2022

Research Type: Experimental
  "We perform experiments on three datasets from the ERASER benchmark (DeYoung et al. 2020a) (FEVER, MultiRC, Movies)..."

Researcher Affiliation: Academia
  "Pepa Atanasova, Jakob Grue Simonsen, Christina Lioma, Isabelle Augenstein, Department of Computer Science, University of Copenhagen, Denmark {pepa, simonsen, c.lioma, augenstein}@di.ku.dk"

Pseudocode: No
  The paper describes its methods in prose, detailing steps and components, but it does not include formal pseudocode blocks or algorithm listings.

Open Source Code: Yes
  "We make an extended version of the manuscript and code available on https://github.com/copenlu/diagnostic-guided-explanations."

Open Datasets: Yes
  "We perform experiments on three datasets from the ERASER benchmark (DeYoung et al. 2020a) (FEVER, MultiRC, Movies), all of which require complex reasoning and have sentence-level rationales."

Dataset Splits: No
  The paper uses standard benchmark datasets but does not explicitly state percentages, sample counts, or citations describing how the training, validation, and test splits were made.

Hardware Specification: No
  The paper states that it uses "BERT (Devlin et al. 2019) base-uncased as our base architecture" but gives no hardware details (e.g., GPU/CPU models, memory, cloud instance types) for the experiments.

Software Dependencies: No
  The paper names key software components such as the Transformer architecture and BERT base-uncased, but it does not provide version numbers for these or for any other software dependencies.

Experiment Setup: No
  The paper describes the model and training objectives, noting hyperparameters such as λ (sparsity penalty) and K (word masking), but it does not report their numerical values or other typical setup details such as learning rate, batch size, or number of epochs.
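To make the Experiment Setup entry concrete: a sparsity penalty weighted by λ and top-K word masking are common ingredients of rationale-extraction objectives. The sketch below is a minimal, generic illustration of those two mechanisms only; the function names, the L1 form of the penalty, and the default values (lam=0.01, k=2) are illustrative assumptions, not values or code from the paper.

```python
def sparsity_penalized_loss(task_loss, token_scores, lam=0.01):
    """Combine a task loss with an L1 sparsity penalty on soft per-token
    rationale scores in [0, 1]. `lam` plays the role of the paper's lambda;
    its value here is a placeholder, since the paper reports none."""
    penalty = sum(abs(s) for s in token_scores) / len(token_scores)
    return task_loss + lam * penalty


def mask_top_k(token_scores, k=2):
    """Return a 0/1 mask that zeroes out the k highest-scoring tokens,
    an illustrative stand-in for K-word masking (K is a placeholder)."""
    top = sorted(range(len(token_scores)), key=lambda i: -token_scores[i])[:k]
    return [0.0 if i in top else 1.0 for i in range(len(token_scores))]


# Example: scores for a 4-token input.
scores = [0.9, 0.1, 0.8, 0.2]
loss = sparsity_penalized_loss(1.0, scores, lam=0.1)   # 1.0 + 0.1 * 0.5 = 1.05
mask = mask_top_k(scores, k=2)                          # [0.0, 1.0, 0.0, 1.0]
```

Without the paper's actual λ, K, learning rate, and batch size, any reproduction would have to tune these values from scratch, which is exactly the gap this entry flags.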