Mitigating Adversarial Norm Training with Moral Axioms
Authors: Taylor Olson, Kenneth D. Forbus
AAAI 2023
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We evaluate our approach by drawing inspiration from a study commonly used in moral development research. This questionnaire aims to test an agent's ability to reason to moral conclusions despite opposed testimony. Our findings suggest that our model can still correctly evaluate moral situations and learn conventions in an adversarial training environment. |
| Researcher Affiliation | Academia | Taylor Olson, Kenneth D. Forbus; Northwestern University; taylorolson@u.northwestern.edu, forbus@northwestern.edu |
| Pseudocode | Yes | Definition (CPI: Conditional Principle of Inheritance). If an agent knows that a conjunction (world) is obligatory given certain contextual preconditions, then the agent knows that every more general conjunction (world) is also obligatory in all more specific contexts. `(<== (knows Eval Agent Mt y c OBL) (knows Eval Agent Mt x b OBL) (implies x y) (implies c b))` |
| Open Source Code | No | The paper does not provide any statement or link indicating that the source code for the described methodology is publicly available. |
| Open Datasets | Yes | We obtained 133 action descriptions of transgressions paired with their domain type (moral vs conventional) from multiple MCT studies (Aharoni et al. 2011; Kagan and Lamb 1990 Tables 4.2-6). |
| Dataset Splits | No | The paper mentions a training phase ('MCT-Training') and a testing phase ('MCT-Testing') but does not explicitly define or refer to a separate validation dataset split. |
| Hardware Specification | No | The paper does not provide any specific details about the hardware (e.g., GPU/CPU models, memory) used to run the experiments. |
| Software Dependencies | No | The paper mentions 'FIRE reasoning engine (Forbus et al. 2010)' and 'CNLU (Tomai and Forbus 2009)' but does not provide specific version numbers for these or any other software dependencies. |
| Experiment Setup | Yes | The default belief threshold is 0.9. To test the models, we draw upon various instances of the Moral-Conventional Transgressions (MCT) task (Sousa 2009). This task is used in moral development research to test, among others, four important dimensions of norms: permissibility, seriousness, authority contingency, and generality. |
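
The CPI rule quoted in the Pseudocode row can be illustrated with a minimal Python sketch. This is not the authors' implementation, which runs as a rule in the FIRE reasoning engine. The sketch assumes that a conjunction is a set of literal names and that `implies x y` holds when every literal of `y` also appears in `x`; the names `implies`, `obligatory_by_cpi`, and the example literals are hypothetical.

```python
from typing import FrozenSet, Iterable, Tuple

Conj = FrozenSet[str]      # a conjunction, modeled as a set of literal names (assumption)
Norm = Tuple[Conj, Conj]   # (obligatory behavior x, contextual precondition b)


def implies(specific: Conj, general: Conj) -> bool:
    """Assumed reading of `implies`: a conjunction entails any subset of its literals."""
    return general <= specific


def obligatory_by_cpi(known: Iterable[Norm], y: Conj, c: Conj) -> bool:
    """Return True if y is obligatory in context c by CPI.

    Mirrors the rule body: some known norm (x, b) exists such that
    x implies y (y is more general) and c implies b (c is more specific).
    """
    return any(implies(x, y) and implies(c, b) for x, b in known)


# Hypothetical known norm: helping and speaking politely is obligatory at school.
known_norms = [
    (frozenset({"help", "speak_politely"}), frozenset({"at_school"})),
]

# More general behavior, more specific context: inherited.
inherited = obligatory_by_cpi(
    known_norms,
    frozenset({"help"}),
    frozenset({"at_school", "teacher_absent"}),
)

# Behavior adds a literal the known norm does not cover: not inherited.
not_inherited_behavior = obligatory_by_cpi(
    known_norms,
    frozenset({"help", "share"}),
    frozenset({"at_school"}),
)

# Context is more general than the known precondition: not inherited.
not_inherited_context = obligatory_by_cpi(
    known_norms,
    frozenset({"help"}),
    frozenset(),
)
```

Under these assumptions, inheritance runs in opposite directions for the two arguments: obligations pass to weaker (more general) behaviors but only within stronger (more specific) contexts.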